Love to code, although it bugs me.

Stack Exchange PowerShell Loader


After attending a session at a SQL Saturday event and reading a post by Brent Ozar, I got the idea for a new GitHub project: a loader utility script that imports the Stack Exchange data dumps, in XML format, into a given Microsoft SQL Server database.


The data dumps are an anonymized dump of all user-contributed content on the Stack Exchange network. Each site is formatted as a separate archive consisting of XML files zipped via 7-zip using bzip2 compression. Each site archive includes Posts, Users, Votes, Comments, PostHistory and PostLinks.
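To give a feel for what a loader has to deal with, here is a minimal sketch of reading one of these XML files as a stream. It is written in Python purely for illustration and is not the project's PowerShell code. It assumes each file holds a single root element with one self-closing `row` element per record, with the column values stored as attributes; the `iter_rows` helper and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def iter_rows(source):
    """Yield the attributes of each <row> element as a dict, one at a time.

    `source` is a file path or a binary file-like object.
    Assumption: records are <row .../> elements whose attributes are columns.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            # Copy the attributes before clearing the element.
            yield dict(elem.attrib)
            # Drop the element's contents so large files do not pile up in memory.
            elem.clear()


# Tiny made-up sample standing in for a real dump file such as Users.xml.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<users>
  <row Id="1" DisplayName="alice" Reputation="101" />
  <row Id="2" DisplayName="bob" Reputation="1" />
</users>"""

rows = list(iter_rows(io.BytesIO(SAMPLE)))
for row in rows:
    print(row["Id"], row["DisplayName"], row["Reputation"])
```

Streaming matters here because the dump files for the larger sites can be far too big to load into memory as a single document. All attribute values arrive as strings, so a real loader would still need to convert them to the column types defined in the target schema.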


You can browse and download the data dumps here: https://archive.org/details/stackexchange





The DDL file included for creating the schema and database objects is adapted from Jeremiah Peschka’s soddi project. You can check it out here: https://github.com/peschkaj/soddi





To run the script, you need PowerShell v3 and SQL Server Management Studio 2012 or later.





You can check the project’s page and get the source here:


http://mjlmo.github.io/SEPSLoader/





Happy scripting!
