2007-07-13

Being a SAS Data Warehouse Administrator

Somebody asked me what made a SAS data warehouse programmer different from a normal SAS programmer—was it the amount of SAS knowledge I had?

The short answer was no, it’s all about experience. Here’s a rough longer answer based on my data warehousing experience at 4 different organisations:
  • Strong base SAS and macro is essential, this is what the majority of data warehouse jobs use.

  • Also important is a strong knowledge of how SAS works internally. What gets written to the log in the following sample, and why?
    data _null_;
    log 'hi';
    set work.test(obs=1);
    run;
  • A general knowledge of SAS and the ability to learn more as needed: e.g. I also use ODS and Graph on a regular basis.

  • General knowledge: operating system knowledge (mainframe/Unix/Windows), SQL, reading from different formats (CSV, XML…), writing to various formats (Excel, HTML, XML…), version control, backup…

  • Awareness of machine resources and scheduling

  • Understand the big picture: can you justify creating an index (space and time vs. potential usage & savings)? Should you create another data set for new data or combine with an existing one? When should you create a macro for a common routine vs. a pre-summarised data set?

  • Consistency: users need to be able to transfer their knowledge from one part of the warehouse to another. Don’t make it hard for them.

  • Work within the existing warehouse setup: if the naming standards are CamelCase (my preference), don’t create Underscore_Names.

  • It’s about doing it right

  • Documentation. The ‘d’ word! I try to create clear and commented code, easy-to-use data sets, and document what I’ve done. My goal is to make myself redundant (which hasn’t worked!)

  • Communication: essential within your team, to users, and management. Sometimes users and/or managers will have an exact requirement but it should be implemented in a different way to fit the overall picture—you need to spot these and be able to explain why.

  • After hours work: users don’t like losing the warehouse to maintenance during working hours!

No comments: