Sitemap File Parser (C#)

I need a console application written in C# where I can pass in one or more domain names, and it will locate, download, parse, and load the sitemap info into a SQL database. The complexity of this project is in the details. There are a variety of non-standard sitemap locations that you’ll need to figure out – for example, a WordPress site can use several different plugins to generate a sitemap, and the URL and format of the sitemap vary. The application will need to be able to handle more than just the most obvious “standard” case.

That said, most seem to follow the spec at

Here are a few example sitemaps that we’ll need to be able to deal with:

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

Here’s the process:

1. Get the [url removed, login to view] file, if there is one. Parse it to see if there’s a link to the sitemap.xml. If so, woot.

2. If there’s no [url removed, login to view] file, guess as many sitemap URLs as you can think of until you find one.

3. If you never find one, return “no sitemap found”.

4. Once you’ve got a sitemap file, figure out which type it is. Depending on the type, you may need to:

a. Decompress it.

b. Download more files, maybe decompress them.

5. Load the data into a .NET DataSet. Bulk insert the DataSet into a SQL table (I can provide the source code for that functionality if necessary).
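A rough sketch of how steps 1 to 3 might start. It assumes the elided file in step 1 is robots.txt, which can declare sitemaps on `Sitemap:` lines. The class name, method names, and the fallback path list are all hypothetical; the real list of guesses would be much longer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

/// <summary>Hypothetical helper for steps 1 to 3. None of these names come from the spec.</summary>
public static class SitemapLocator
{
    // Assumption: an illustrative starting list of guesses, not a complete one.
    private static readonly string[] FallbackPaths =
    {
        "/sitemap.xml",
        "/sitemap_index.xml",
        "/sitemap.xml.gz"
    };

    /// <summary>Reads robots.txt one line at a time and returns every URL declared on a "Sitemap:" line.</summary>
    public static List<string> ReadSitemapUrls(TextReader robotsReader)
    {
        const string Prefix = "sitemap:";
        List<string> sitemapUrls = new List<string>();
        string line = robotsReader.ReadLine();
        while (line != null)
        {
            string trimmedLine = line.Trim();
            if (trimmedLine.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            {
                string url = trimmedLine.Substring(Prefix.Length).Trim();
                if (url.Length > 0)
                {
                    sitemapUrls.Add(url);
                }
            }
            line = robotsReader.ReadLine();
        }
        return sitemapUrls;
    }

    /// <summary>Builds the URLs to try, in order, when robots.txt yields nothing.</summary>
    public static List<string> BuildFallbackUrls(string domainName)
    {
        List<string> candidateUrls = new List<string>();
        foreach (string path in FallbackPaths)
        {
            candidateUrls.Add("http://" + domainName + path);
        }
        return candidateUrls;
    }
}
```

If both lists come back empty or every candidate fails to download, the caller would report “no sitemap found” as in step 3.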

I care about 2 things in the sitemap: “loc” – the URL, and “lastmod” – the date / time it was modified.
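One possible way to pull out just those two values is a forward-only `XmlReader`, which never holds the whole file in memory. This is a sketch under assumptions: `SitemapEntry` and `SitemapReader` are made-up names, only `url` entries are handled (not sitemap index files), namespaces are ignored, and `lastmod` is kept as raw text because its precision varies between sitemaps.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

/// <summary>Hypothetical holder for the two values of interest.</summary>
public sealed class SitemapEntry
{
    public string Location { get; set; }
    public string LastModified { get; set; }
}

public static class SitemapReader
{
    /// <summary>Streams through a sitemap and yields one entry per url element.</summary>
    public static IEnumerable<SitemapEntry> ReadEntries(Stream sitemapStream)
    {
        using (XmlReader reader = XmlReader.Create(sitemapStream))
        {
            SitemapEntry currentEntry = null;
            string currentElement = null;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    currentElement = reader.LocalName;
                    if (currentElement == "url")
                    {
                        currentEntry = new SitemapEntry();
                    }
                }
                else if (reader.NodeType == XmlNodeType.Text || reader.NodeType == XmlNodeType.CDATA)
                {
                    if (currentEntry != null)
                    {
                        if (currentElement == "loc")
                        {
                            currentEntry.Location = reader.Value.Trim();
                        }
                        else if (currentElement == "lastmod")
                        {
                            currentEntry.LastModified = reader.Value.Trim();
                        }
                    }
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    if (reader.LocalName == "url" && currentEntry != null)
                    {
                        yield return currentEntry;
                        currentEntry = null;
                    }
                    currentElement = null;
                }
            }
        }
    }
}
```

Because the method uses `yield return`, the caller can load rows in batches while the file is still being read. Sitemaps that nest other `loc` elements inside a `url` (image extensions, for example) would need a namespace check that this sketch leaves out.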

I’d also need to log the current date/time that the sitemap is being processed, the location of the sitemap file, and the domain name.
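As a starting point for discussion, the fields above could map onto a single `DataTable` (which can sit inside a DataSet) like the one below. The table name, column names, and column types are assumptions and not a proposed final schema.

```csharp
using System;
using System.Data;

/// <summary>Hypothetical factory for the in-memory table that gets bulk inserted.</summary>
public static class SitemapTableFactory
{
    /// <summary>Creates an empty table with the two sitemap values plus the three log values.</summary>
    public static DataTable CreateTable()
    {
        DataTable table = new DataTable("SitemapEntries");
        table.Columns.Add("DomainName", typeof(string));
        table.Columns.Add("SitemapLocation", typeof(string));
        table.Columns.Add("ProcessedAtUtc", typeof(DateTime));
        table.Columns.Add("Loc", typeof(string));
        table.Columns.Add("LastMod", typeof(string));
        return table;
    }

    /// <summary>Appends one row. A missing lastmod is stored as a database null.</summary>
    public static void AddRow(
        DataTable table,
        string domainName,
        string sitemapLocation,
        DateTime processedAtUtc,
        string loc,
        string lastMod)
    {
        DataRow row = table.NewRow();
        row["DomainName"] = domainName;
        row["SitemapLocation"] = sitemapLocation;
        row["ProcessedAtUtc"] = processedAtUtc;
        row["Loc"] = loc;
        if (lastMod == null)
        {
            row["LastMod"] = DBNull.Value;
        }
        else
        {
            row["LastMod"] = lastMod;
        }
        table.Rows.Add(row);
    }
}
```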

Coding Standards

I’ll review the code, so make it look nice, maintainable, and well documented.

Structurally, the functional code needs to be separate from the console (UI) code. The parsing, downloading, unzipping, and DB logic should be in a DLL that is separate from the console EXE.

If you are thinking that this project should have fewer than 4 class files, I don’t think you are organizing it well.

Performance is important to me. Understand data structures and strings. Don’t do stupid things with strings. Converting a 10MB+ file into a string and then searching it is wasteful. Don’t embarrass yourself.
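To illustrate the kind of stream handling meant here, the sketch below peeks at the first two bytes for the gzip signature (0x1F 0x8B) and wraps the stream in a `GZipStream` when they match, so a large file is never turned into one big string. `SitemapStreams` and `OpenPossiblyCompressed` are hypothetical names, and the sketch assumes a seekable stream, for example a download saved to a temp file first.

```csharp
using System;
using System.IO;
using System.IO.Compression;

/// <summary>Hypothetical helper that hides whether a sitemap arrived gzip-compressed.</summary>
public static class SitemapStreams
{
    /// <summary>Returns a decompressing wrapper for gzip data, or the original stream otherwise.</summary>
    public static Stream OpenPossiblyCompressed(Stream rawStream)
    {
        if (!rawStream.CanSeek)
        {
            throw new ArgumentException("A seekable stream is required.", "rawStream");
        }

        // Peek at the first two bytes, then rewind so the caller sees the full content.
        long startPosition = rawStream.Position;
        int firstByte = rawStream.ReadByte();
        int secondByte = rawStream.ReadByte();
        rawStream.Position = startPosition;

        if (firstByte == 0x1F && secondByte == 0x8B)
        {
            return new GZipStream(rawStream, CompressionMode.Decompress);
        }

        return rawStream;
    }
}
```

The returned stream can be handed straight to an `XmlReader`, so decompression and parsing happen in one pass.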

I like readable code. Assume I’m dumb. Ternary operators are too complicated; use a full if – including curlies. Single character variables are okay in for loops and functions with 6 lines or fewer – and pretty much no other circumstance.

Unit Tests

You’ll need to write NUnit test cases for each of the various sitemap types.


I’ve tried to clearly spell out what I want in this spec. But I’ve only spent 30 minutes writing it. If I made it *perfect* and spent 5 days writing it, it would be 30 pages. You’d be overwhelmed and wouldn’t read it, and wouldn’t know how to bid. And if I spent 5 days writing a spec, I should have just written the code. Here’s my point:

Expect some scope creep – probably like 20-25%. I haven’t spelled out all of the different sitemap “types” and I anticipate you’ll send me some code, and I’ll find some sitemaps that break it. I won’t like your DB schema, that sort of thing. Bid accordingly.

Skills: .NET, C# Programming, SQL, Web Scraping


About the employer:
( 47 reviews ) United States

Project ID: #4269260

Awarded to:


Hi, I would suggest making the sitemap path patterns configurable and supplying the list from outside. That way the user can add missing sitemap patterns if the application can't find a sitemap in [url removed, login to view] or in the default paths. I am ex...

$500 USD in 14 days
(12 reviews)

9 freelancers have bid on this job.


eCommerce, ASP.NET, C#, MVC, WCF, WPF, Silverlight, and HTML/CSS/jQuery expert here. My portfolio is here: [url removed, login to view]

$500 USD in 15 days
(36 reviews)

Hi, we are an expert team of SEO/WordPress/PHP/Joomla/Drupal developers and designers. Thanks, Gaurav

$650 USD in 25 days
(31 reviews)

##### YOUR SEARCH ENDS HERE! ##### GET IT RIGHT THE FIRST TIME! Check the message and contact us. SI TEAM.

$750 USD in 14 days
(13 reviews)

Hi, I have very strong hands-on experience with C#.NET. I could do this project. Please check my profile for more details. Thanks! Regards

$750 USD in 20 days
(3 reviews)

For our updated bid, please see the private message as well. Thanks.

$735 USD in 13 days
(2 reviews)

Hi, I am an experienced .NET developer and will deliver the best quality and most maintainable code possible. Please feel free to check my profile. Ready to get the job done.

$300 USD in 10 days
(3 reviews)

Hi, Please check private message. Thanks

$750 USD in 15 days
(0 reviews)

I have in-depth experience with file parsing in C# and Java. I have experience reading from and writing to databases, and in developing a 3-tier architecture web application. I understand that the way you...

$450 USD in 30 days
(0 reviews)

I have 10 years of experience in C#.NET/Java [url removed, login to view]

$750 USD in 30 days
(0 reviews)

Hi! I have over 5 years of experience in the .NET sphere. I can complete this project producing high quality code.

$333 USD in 12 days
(0 reviews)