Cancelled

HTML Code Parser (C# .NET)

hi, i need a Windows Forms application in C# to do the following:

RECEIVE TEXT INPUT: i will paste a URL into a textbox

example: "http://www.amazon.com/gp/offer-listing/0521347718/ref=sr_1_5_up_1_pap_olp?s=books&ie=UTF8&qid=1340742213&sr=1-5&condition=used"

PROCESSING: the program must identify specific elements, which are assumed to be present in this HTML page, and extract them.

for instance, the page at the URL above contains a list of sellers. the "Seller: " text will appear, and directly after it, a hyperlinked text with the name of the seller. thus, various seller names will appear within the corresponding HTML elements in the source code of that page, and these must be extracted. here are the elements i wish to extract:

[1] the title of the item. this only appears at the top of the page. items 2-4 appear multiple times.

[2] the price in the first column (without the currency symbol)

[3] the item condition

[4] the seller name that appears directly after the "Seller: " text.

these elements are contained within their respective HTML elements in the source. the program must parse the HTML code returned for these elements.
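a minimal sketch of the extraction step follows. the real page markup is not given in this posting, so everything about the markup here is an assumption made only for illustration: the title sits in an `<h1>`, the price in `<span class="price">`, the condition in `<div class="condition">`, and the seller name in the first link after the "Seller: " text. the `Offer` and `OfferParser` names are hypothetical too. the patterns would need adjusting to whatever the real source looks like.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// One row of the offer list: price (no currency symbol), condition, seller.
public sealed class Offer
{
    public string Price { get; set; }
    public string Condition { get; set; }
    public string Seller { get; set; }
}

public static class OfferParser
{
    // ASSUMPTION: these patterns target a hypothetical markup,
    // not the actual source of the page in the example URL.
    private static readonly Regex TitleRx = new Regex(
        @"<h1[^>]*>(?<title>.*?)</h1>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Price, condition, and seller are matched in page order for each offer.
    // [^\d<]* skips a leading currency symbol so only the number is captured.
    private static readonly Regex OfferRx = new Regex(
        @"<span\s+class=""price"">\s*[^\d<]*(?<price>\d[\d.,]*)\s*</span>" +
        @".*?<div\s+class=""condition"">(?<condition>.*?)</div>" +
        @".*?Seller:\s*<a[^>]*>(?<seller>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the page title, or an empty string if no <h1> is found.
    public static string ExtractTitle(string html)
    {
        Match m = TitleRx.Match(html);
        return m.Success ? Clean(m.Groups["title"].Value) : string.Empty;
    }

    // Returns one Offer per price/condition/seller group, in page order.
    public static List<Offer> ExtractOffers(string html)
    {
        var offers = new List<Offer>();
        foreach (Match m in OfferRx.Matches(html))
        {
            offers.Add(new Offer
            {
                Price = m.Groups["price"].Value,
                Condition = Clean(m.Groups["condition"].Value),
                Seller = Clean(m.Groups["seller"].Value)
            });
        }
        return offers;
    }

    // Strips nested tags, decodes HTML entities, and collapses whitespace.
    private static string Clean(string fragment)
    {
        string text = Regex.Replace(fragment, "<[^>]+>", string.Empty);
        text = WebUtility.HtmlDecode(text);
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

regular expressions keep the sketch dependency-free, but they are fragile against markup changes, and an offer that is missing one of the three elements would cause the pattern to pair fields from neighbouring offers. a proper HTML parser library is likely the sturdier choice for the finished program.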

OUTPUT: i want the 4 extracted variables to be output to a text file in TSV (tab separated values) format. the first line of the TSV file is the exception: it contains only the title, since there is only one title per page.

from the 2nd line onward, each line must contain one price, one condition, and one seller name, separated by tabs. each line must end with a line break, with the next entry on the next line, and so on.
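the output layout described above could be sketched as follows. `TsvWriter`, `Build`, and `Save` are hypothetical names, and two details are assumptions not stated in the posting: lines end with the Windows line break (`\r\n`), and the file is written as UTF-8. tabs or line breaks inside a value are replaced with spaces so that they cannot break the column layout.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class TsvWriter
{
    // Builds the file content: the title alone on line 1, then one
    // "price<TAB>condition<TAB>seller" line per offer.
    public static string Build(string title, IEnumerable<string[]> rows)
    {
        var sb = new StringBuilder();
        sb.Append(Sanitize(title)).Append("\r\n");
        foreach (string[] row in rows)
        {
            // Each row is expected to hold { price, condition, seller }.
            var fields = new List<string>();
            foreach (string field in row)
            {
                fields.Add(Sanitize(field));
            }
            sb.Append(string.Join("\t", fields)).Append("\r\n");
        }
        return sb.ToString();
    }

    // Writes the TSV content to the given path (ASSUMPTION: UTF-8 encoding).
    public static void Save(string path, string title, IEnumerable<string[]> rows)
    {
        File.WriteAllText(path, Build(title, rows), Encoding.UTF8);
    }

    // Replaces tabs and line breaks with spaces and trims the value.
    private static string Sanitize(string value)
    {
        return (value ?? string.Empty)
            .Replace('\t', ' ')
            .Replace('\r', ' ')
            .Replace('\n', ' ')
            .Trim();
    }
}
```

for example, `TsvWriter.Save("offers.tsv", title, rows)` would write the title followed by one line per offer, where `offers.tsv` is just a placeholder file name.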

can this be done in a 30min gig?

regards,

Skills: Windows Desktop

About the employer:
(5 reviews) Pretoria, South Africa

Project ID: #2754751