Azure Data Factory – “Execute SQL Task”


Scenario


How to run single SQL commands using Azure Data Factory (ADF)? Sometimes I need just that. No fancy requirements just execute a simple UPDATE for example.

Investigation


Unfortunately there is no such thing as our good old friend the Execute SQL Task in SSIS. I would like to think this is an oversight but I’m not sure that this functionality will come into ADF. I have looked around online and there are some workarounds but the answers to this question are scarce. But don’t despair I try to collect a few methods for you, dear Reader until (if ever) a shiny new icon “SQL” appears among the different activities of an ADF pipeline.

Workarounds


Stored Procedure – redesigning your SQL command as a stored procedure

This is my least favourite but the one that is most frequently recommended. I don’t like wrapping a possibly very simple SQL command into a stored procedure as I think there is too much overhead with maintaining the code, putting the logic outside ADF, … honestly: too much hassle but good to keep in mind as a last resort.

Stored Procedure with a twist

To execute a stored procedure that stored procedure needs to exist. I could either create it myself (see previous option) or I can use one that’s available out of the box in a SQL Server database! The idea came to me when looking through a totally different stored procedure that puts together a SQL statement dynamically. Maybe you already know now before I write it down:

sp_executesql

https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-executesql-transact-sql (it applies to all 4 SQL Server platform no matter if you have it on-prem or in the cloud)

EXECUTE sp_executesql N' your SQL command comes here '

Lookup activity

This is the third way I have found. Though it’s not designed to simply execute SQL commands but to execute them and return a piece of data in some format. So for SELECT statements it’s good especially if you want to use the results later in your pipeline (it’s designed for that). But then how to run a data manipulation command like UPDATE?
Well there is no real requirement what happens between issuing the command and returning a piece of data. So it’s easy to run an UPDATE statement with a valid OUTPUT clause!
https://docs.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql

UPDATE your_table SET your_column = your_column * 15
OUTPUT Inserted.your_column
WHERE IsGST = 'True'

Dynamic SQL

This was a tricky one to solve. Again I got quite annoyed that Azure Data Factory does not have the native functionality to execute a SQL statement. I spent a lot of time trying to make the sp_executesql version work because that system stored procedure is supported by all SQL Server platforms according to https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-executesql-transact-sql See that all four has a green tick next to them in the APPLIES TO row. Despite that it didn’t behave correctly despite my diligent efforts of disciplining the system.

So I ended up being in a situation that my dynamic SQL command is nicely crafted by a stored procedure that runs very obediently in database A. I used the Lookup task to get that in the way it’s intended to be used. So far so good. The SQL query appeared nicely in the output of that task.

I don’t want to go through all the hurdles as that would be more confusing than useful so let me just show how I managed to get it working:

ADF pipeline to extract and run Dynamic SQL
  1. the lookup task runs a stored procedure in Database A that returns a SQL query with a dummy SELECT 1 at the end as the Lookup task MUST return something.
  2. the returned query gets assigned to a SQLString variable using the expression @activity(‘LookupDynamicSQL’).output.firstRow.ColumnName
    I tried using the output directly in step 3 but it didn’t work for some reason
  3. another Lookup task for Database B (a separate dataset) that is set to use a Query and the query itself is @variables(‘SQLString’)

And it runs nicely. I actually have a ForEach loop wrapped around these three steps as I need to run them for multiple items and they cooperate nicely. ‘Why wouldn’t they?‘ – you could ask but I have seen weirder errors going against common sense so I’m simply happy to tell you that it works that way, too.

One thought on “Azure Data Factory – “Execute SQL Task”

  1. Very useful info. I have the same challenge which I have an update query against Oracle and I have no idea how to return something for lookup activity. Oracle doesn’t have output clause and I tried returning into which doesn’t work either. Anyway thanks for the sharing.

Leave a comment