SQL Insert if not exists is a popular task. I often use it myself in "ad-hoc updates" that is part of post deployment which needs to be re-runnable. If I need to perform more complex insert and updates which are common in data warehouses than I suggest to check MERGE statement and look into Slowly Changing Dimensions concept of data warehouses which is often used with SSIS (There is free third part component to do Kimball SCD from codeplex. SSIS has its own SCD but I do not recommend it at time of writing this post).
I have several different examples so feel free to go the most suitable one for you:
In the first example perform SQL Insert if row does not exist. See below screenshot. I have two tables. The table on the left is Employee table which is where I want to insert new data and the table on the right is my source which shows me any new employees. As you can see there is one new employee that needs to be inserted.
In order to perform SQL Insert if not exists I need to use my source table and exlude any rows that already exist in my target table (Employee table). I do that using popular method which uses left outer join and filters for NULLs (exists only on one side).
See below the SQL solution that inserted 1 row.
NOTE: I used first name and surname in this case (which is simple example but goes wrong when you have the same people with the same name). I could have also used just ID in this case but ID is not always best option. When you use this method ensure that you specify correctly the join.
The disadvantage of this method is that it does not perform updates and in this cases you should look into using merge or SCD approach.
The second example I have is simplier to write however it has its disadvantages. The basic idea is to delete rows that you have to ensure you don't duplicate it and insert all new rows. See below syntax
The disadvantage of this method is that you actually delete row so if you have identity (which is not the case in my example) that would generate new ID which is not what you are after. Another disadvantage is that you perform two actions so you would have to write transaction to do "all or nothing" in other words you don't want to perform delete without insert which might happen for multiple reasons. So would I ever use it? I actually do use it frequently for configuration tables; SSIS package configuration to be precise. Where I do want to delete old value and don't want to perform update and because I validate my output I don't bother with transactions. So in some cases this approach is simple and quick to implement.
There are other ways to do the same thing which does not matter too much as long as you chose the right approach to meet your requirements and avoid "unexpected issues".
Hope that helps