In progress

Parsing messy HTML using PERL/REGEX

This project consists of writing a Perl script that parses a large number of messy HTML files using non-standard regular expressions. I have attached a representative HTML file to give you an idea. I would like to transform the content of each HTML file into a delimiter-separated file (CSV-style, with semicolons as the separator) of the following form:

Category 1; Subcategory A; Text of paragraph 1 in Subcategory A

Category 1; Subcategory A; Text of paragraph 2 in Subcategory A

Category 1; Subcategory A; Text of paragraph 3 in Subcategory A

Category 1; Subcategory B; Text of paragraph 1 in Subcategory B

Category 1; Subcategory B; Text of paragraph 2 in Subcategory B

Category 2; Subcategory C; Text of paragraph 1 in Subcategory C

Category 2; Subcategory C; Text of paragraph 2 in Subcategory C

Category 2; Subcategory D; Text of paragraph 1 in Subcategory D

Each HTML page contains between 1 and 12 categories, and the number varies from page to page. The number of subcategories, and of paragraphs under each subcategory, also varies. I have experimented with Perl and identified some useful regular expressions. However, I am not a programmer and do not have the time at the moment to learn Perl and code this myself. I think it is a relatively straightforward task for somebody who knows how to program in Perl.
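The transformation described above can be sketched roughly as follows. This is a minimal illustration written in Python rather than the Perl the project asks for, and it rests on assumptions: the attached sample file is not reproduced here, so the sketch pretends that categories are `<h1>` elements, subcategories are `<h2>` elements, and paragraphs are `<p>` elements. The names `extract_rows`, `rows_to_text`, and `SAMPLE` are hypothetical, and the real patterns would have to be adapted to the actual markup.

```python
import csv
import html
import io
import re

# ASSUMPTION: the real markup is not shown in the posting. This sketch pretends
# categories are <h1>, subcategories are <h2>, and paragraphs are <p>.
# Each block runs until the next such opening tag, so missing closing tags
# (common in messy HTML) do not break the match.
BLOCK_RE = re.compile(
    r"<(h1|h2|p)\b[^>]*>(.*?)(?=<(?:h1|h2|p)\b|\Z)",
    re.IGNORECASE | re.DOTALL,
)
TAG_RE = re.compile(r"<[^>]+>")


def clean(fragment):
    """Strip leftover tags, decode entities, and collapse whitespace."""
    text = html.unescape(TAG_RE.sub(" ", fragment))
    return " ".join(text.split())


def extract_rows(page):
    """Return (category, subcategory, paragraph) tuples in document order."""
    rows = []
    category = subcategory = ""
    for match in BLOCK_RE.finditer(page):
        tag = match.group(1).lower()
        text = clean(match.group(2))
        if not text:
            continue  # skip empty blocks
        if tag == "h1":
            category, subcategory = text, ""
        elif tag == "h2":
            subcategory = text
        else:
            rows.append((category, subcategory, text))
    return rows


def rows_to_text(rows):
    """Serialize rows as semicolon-separated lines, quoting where needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical input with mixed case, stray attributes, and unclosed tags.
SAMPLE = """
<H1>Category 1</h1>
<h2 class=x>Subcategory A
<p>Text of <b>paragraph 1</b> in Subcategory A
<p>Text of paragraph 2 in Subcategory A</p>
<h1>Category 2</h1>
<h2>Subcategory C</h2>
<P>Text of paragraph 1 in Subcategory C
"""

print(rows_to_text(extract_rows(SAMPLE)), end="")
```

Because the category and subcategory are carried forward as state, the varying number of categories, subcategories, and paragraphs per page needs no special handling. The `csv` writer quotes any field that itself contains a semicolon, which a plain string join would not do.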

Skills: Data Processing, Perl

About the Employer:
(1 review) London, United Kingdom

Project ID: #419157

Awarded to:

zeke

I am an experienced Perl programmer. I am ready to start right now and can finish within several hours. My bid is for a fast, professional job. Please contact me via PMB if you have any questions. Best regards, Zeke

$50 USD in 0 days
(26 reviews)
5.0

12 freelancers have bid an average of $68 for this job

gangabass

I can do this job for you. See PM for details.

$30 USD in 1 day
(174 reviews)
6.1
Mindon

Please check the PM.

$50 USD in 2 days
(58 reviews)
6.1
alexander2007

Please check PM. Thanks.

$60 USD in 2 days
(21 reviews)
5.7
sureshdevi

I can do this work. Thanks, Suresh

$60 USD in 2 days
(31 reviews)
4.9
perldev123

I have a working prototype that does the parsing on your sample and most of the output formatting ready now. I would expect to be able to provide a completed program in a few hours.

$45 USD in 1 day
(1 review)
1.3
whoisbp

Looks like an interesting project; see PM.

$100 USD in 5 days
(0 reviews)
0.0
alexgav

I can do it! I have 8 years of experience in PERL.

$70 USD in 2 days
(0 reviews)
0.0
nickProfessional

Hi, I am an expert in Perl and have written several HTML parsing scripts in Perl that power some nice websites. I can do this job easily and start right now. Kindly reply so we can take this forward. Nick

$35 USD in 1 day
(1 review)
0.0
anoop406

Experienced Perl programmer; I have scraped a lot of sites before.

$150 USD in 7 days
(0 reviews)
0.0
esagjag

I can do this task. Warm regards, Sagar

$100 USD in 1 day
(0 reviews)
0.0
AlexeiBo

Hi! I'm ready to do it. Delivery date: about 3 days, including testing and fixes.

$70 USD in 3 days
(0 reviews)
0.0