Make a function that produces a regex pattern to identify URLs of interest

Suppose we intend to scrape a job portal, [login to view URL], which contains many sublinks, such as:

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

The idea is to apply a difference-checking algorithm that yields a generic regex matching the above routes, treating the variable parts of the URLs as wildcards, based on whether each route yielded jobs or not.

Build a function, generatePattern(routes), where routes is an array of objects, each having:

URL: str

hasYieldedJob: bool
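The input shape described above could be sketched in Python as follows. This is only an illustration: the `Route` type, the `split_by_outcome` helper, and the `example.com` URLs are assumptions, since the real URLs are hidden in the posting.

```python
from typing import List, Tuple, TypedDict


class Route(TypedDict):
    # Field names follow the brief: URL (str) and hasYieldedJob (bool).
    URL: str
    hasYieldedJob: bool


def split_by_outcome(routes: List[Route]) -> Tuple[List[str], List[str]]:
    """Return (URLs that yielded jobs, URLs that did not)."""
    positives = [r["URL"] for r in routes if r["hasYieldedJob"]]
    negatives = [r["URL"] for r in routes if not r["hasYieldedJob"]]
    return positives, negatives


# Hypothetical sample data, not taken from the posting.
sample_routes: List[Route] = [
    {"URL": "https://example.com/job/123/python-developer", "hasYieldedJob": True},
    {"URL": "https://example.com/job/456/data-engineer?ref=home", "hasYieldedJob": True},
    {"URL": "https://example.com/about", "hasYieldedJob": False},
]
```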

In the above example, all the links except the last three yielded jobs, so the ideal regex (shown here as a fictive pattern) would be:

/job/{any number}/{any string}/?{any string}
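One possible translation of this fictive pattern into a real Python regex is sketched below. It assumes that "/?" means an optional trailing slash and that the final {any string} may be empty; the brief does not confirm either reading. `JOB_PATTERN` and `looks_like_job` are hypothetical names, and the sample paths are invented.

```python
import re

# /job/{any number}/{any string}/?{any string}
#   \d+        -> {any number}
#   [^/?#]+    -> {any string} limited to one path segment
#   /?         -> optional trailing slash (assumed reading)
#   .*         -> trailing {any string}, e.g. a query string
JOB_PATTERN = re.compile(r"^/job/\d+/[^/?#]+/?.*$")


def looks_like_job(path: str) -> bool:
    """Return True if the path (plus optional query) matches the job pattern."""
    return JOB_PATTERN.match(path) is not None
```

Under this reading, "/job/123/python-developer?ref=home" matches, while "/job/abc/python-developer" and "/about" do not.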

Case scenarios

Query parameters should be considered as variables due to their complexity.

We do not want to apply a constant rule to them, even if they are identical across the given dataset of URLs. So if we have “/job/foo?parameter=true”, the pattern will be “/job/foo{any string}”. Additional brainstorming is welcome.
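A minimal sketch of this query-parameter rule, assuming the query string is simply discarded and replaced by a wildcard tail. The function name `pattern_ignoring_query` is hypothetical; `urlsplit` and `re.escape` come from the Python standard library.

```python
import re
from urllib.parse import urlsplit


def pattern_ignoring_query(url: str) -> str:
    """Build a regex from the URL path only; the query becomes {any string}."""
    path = urlsplit(url).path           # "/job/foo?parameter=true" -> "/job/foo"
    return "^" + re.escape(path) + ".*$"  # ".*" stands in for {any string}


pattern = pattern_ignoring_query("/job/foo?parameter=true")
```

Note that a bare ".*" tail also accepts paths such as "/job/foobar". If that is too loose, the tail could be narrowed to an optional query string, but the brief only asks for {any string}.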

- If a route segment contains hyphens, say ".../foo-bar/...", it will be treated as ".../{any string}/..." even if that segment is invariant across the supplied URLs.
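Putting the rules above together, one way the requested function might look is sketched below. It is not a definitive implementation: it assumes a simple position-by-position comparison of path segments across the routes that yielded jobs, uses the Python-style name `generate_pattern` in place of `generatePattern`, and is exercised only on invented `example.com` URLs. Routes that did not yield jobs are not used to build the pattern here; they serve only as a check that the result does not match them.

```python
import re
from urllib.parse import urlsplit

ANY_NUMBER = r"\d+"
ANY_STRING = r"[^/?#]+"   # any string within a single path segment
TAIL = r"/?.*"            # optional slash, then query or anything else


def _segments(url):
    """Split the URL path into non-empty segments, ignoring the query."""
    return [s for s in urlsplit(url).path.split("/") if s]


def _generalize(values):
    """Turn the set of values seen at one position into a regex fragment."""
    if len(values) == 1:
        (only,) = values
        if "-" not in only:          # hyphenated segments are never constant
            return re.escape(only)
    if all(re.fullmatch(r"\d+", v) for v in values):
        return ANY_NUMBER
    return ANY_STRING


def generate_pattern(routes):
    """Build a regex string from the routes whose hasYieldedJob is True."""
    positives = [_segments(r["URL"]) for r in routes if r["hasYieldedJob"]]
    if not positives:
        raise ValueError("need at least one route that yielded a job")
    depth = min(len(p) for p in positives)   # compare the shared prefix only
    parts = [_generalize({p[i] for p in positives}) for i in range(depth)]
    return "^/" + "/".join(parts) + TAIL + "$"


def matches(pattern, url):
    """Test the pattern against the URL's path plus query string."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return re.match(pattern, target) is not None


# Hypothetical data; the real URLs are hidden in the posting.
routes = [
    {"URL": "https://example.com/job/101/python-developer", "hasYieldedJob": True},
    {"URL": "https://example.com/job/2022/data-engineer/?ref=home", "hasYieldedJob": True},
    {"URL": "https://example.com/job/7/qa-tester?utm=x", "hasYieldedJob": True},
    {"URL": "https://example.com/about", "hasYieldedJob": False},
    {"URL": "https://example.com/jobs", "hasYieldedJob": False},
    {"URL": "https://example.com/contact?x=1", "hasYieldedJob": False},
]
pattern = generate_pattern(routes)
```

A reasonable follow-up check is to confirm that no route with `hasYieldedJob` set to false matches the generated pattern, and to tighten the pattern or collect more data if one does.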

Skills: Python, Regular Expressions


About the employer:
( 9 reviews ) Piazza Armerina, Italy

Project ID: #29045421

10 freelancers have bid an average of $102 for this job


I have experience in Python with a regex generator/checker for a license plate checker. Links to some previous projects: ...

$140 USD in 7 days
(29 reviews)

Dear Client, warm greetings. I have been a Python developer for 3+ years and have experience building management, distributed, and database applications, with Machine Learning, Ensemble Learning, Deep Learning implementat...

$111 USD in 1 day
(6 reviews)

Dear employer, hi. I can develop the code to find the URLs which have yielded jobs. I read the description carefully and understand exactly what you want. I am a computer programmer with more than 10 years of working experienc...

$100 USD in 7 days
(9 reviews)

Hello Sir, I have previous knowledge and experience with regex. I think I can meet your requirements. Inbox me please so I can help. Thanks

$70 USD in 7 days
(6 reviews)

NOTE: I HAVE EXPERTISE IN WEB SCRAPING. With respect to this project I would like to present myself as a candidate for your consideration. I have more than 12 years of IT experience. I have successfully completed pro...

$140 USD in 4 days
(1 review)

Hello, Python EXPERT here. I have read your description and I am very interested in your project. I am a well-experienced and skillful Python developer with 15+ years of experience in software development. Confident in your project and I...

$140 USD in 7 days
(5 reviews)

Hi, I can build this function using Python and will of course give you the script. Ready to start right NOW. I could make a sample script for the details presented here if you want.

$60 USD in 1 day
(4 reviews)

Hello, this is Rahaman. I will build you a Python function that uses regex to identify whether a link on the given website has jobs or not. This job seems interesting to me. I have extensive experience in crawling websit...

$75 USD in 2 days
(1 review)

Hello, I am an individual freelancer. I have quite good experience with regular expressions and Python's re library. I am available for this task and will try to deliver the script today. Waiting for your kind respon...

$100 USD in 2 days
(1 review)

Hi, I can get you a working version of the function you need straight away. You will probably want to supply some additional test data, to see if it needs to account for additional factors not present in the s...

$80 USD in 1 day
(0 reviews)