We need to crawl product prices from 8 sites (we can PM the site list to you on request).
Some of the sites are very simple, as they have the product ID in the URL,
e.g. [url removed, login to view]
so you can crawl them very easily.
We only need to store:
1. list price
2. special price (if any)
3. sku / item code
4. product id (if any)
5. product title
6. product weight (if any)
7. product URL
8. site ID, to indicate which site the product comes from
9. product image (one small listing image is enough)
10. any other fields that your program needs in order to run, such as last crawl date, flag, hash, etc.
All the data is stored in a MySQL 5.x database.
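To give you an idea, here is a rough sketch of the kind of table we have in mind for the fields listed above. The table name, column names, column sizes and the unique key on (site_id, sku) are only our assumptions for illustration; you can change them if your design needs something different.

```php
<?php
// Hypothetical schema sketch. Names and sizes are assumptions, not a fixed requirement.
// The UNIQUE key assumes every product has a sku / item code within its site.
$createProductTable = <<<'SQL'
CREATE TABLE IF NOT EXISTS product (
    id            INT UNSIGNED      NOT NULL AUTO_INCREMENT,
    site_id       SMALLINT UNSIGNED NOT NULL,
    sku           VARCHAR(64)       NOT NULL,
    product_id    VARCHAR(64)       NULL,
    title         VARCHAR(255)      NOT NULL,
    list_price    DECIMAL(10,2)     NOT NULL,  -- always in USD
    special_price DECIMAL(10,2)     NULL,      -- NULL when there is no special price
    weight        VARCHAR(32)       NULL,
    url           VARCHAR(1024)     NOT NULL,
    image_url     VARCHAR(1024)     NULL,
    last_crawled  DATETIME          NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uq_site_sku (site_id, sku)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
SQL;
```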
We use the data to compare prices (you DO NOT need to write this part; just storing the data in the format we request is enough), so there is no need to store other information.
But if you think that storing other information, such as the category, is helpful, you can store it.
We don't mind.
All product prices should be in USD.
The script must be able to run as a cron job OR manually.
If a product has been crawled before, update the existing record.
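One possible way to do the "update if it exists" part is MySQL's INSERT ... ON DUPLICATE KEY UPDATE. The sketch below is only an illustration: it assumes a `product` table with a UNIQUE key on (site_id, sku), and the table, column and function names are hypothetical.

```php
<?php
// Sketch only. Assumes a `product` table with a UNIQUE key on (site_id, sku).
// All table, column and function names here are hypothetical.
function upsertProductSql()
{
    return 'INSERT INTO product
                (site_id, sku, title, list_price, special_price, url, last_crawled)
            VALUES
                (:site_id, :sku, :title, :list_price, :special_price, :url, NOW())
            ON DUPLICATE KEY UPDATE
                title         = VALUES(title),
                list_price    = VALUES(list_price),
                special_price = VALUES(special_price),
                url           = VALUES(url),
                last_crawled  = NOW()';
}

// Inserts the product, or updates the stored row when (site_id, sku) already exists.
function upsertProduct(PDO $pdo, array $p)
{
    $stmt = $pdo->prepare(upsertProductSql());
    return $stmt->execute(array(
        ':site_id'       => $p['site_id'],
        ':sku'           => $p['sku'],
        ':title'         => $p['title'],
        ':list_price'    => $p['list_price'],
        ':special_price' => isset($p['special_price']) ? $p['special_price'] : null,
        ':url'           => $p['url'],
    ));
}
```

With this approach the crawler does not need a separate SELECT to check whether the product was crawled before.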
It should allow us to input a product URL or a product ID range, so that we can crawl some products
manually. This is mainly used to fix products that timed out during a crawl.
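For example, the ID range input could be handled roughly like this. It is a minimal sketch: the function name, the "from-to" input format and the maximum span are our assumptions, and it covers the ID range only, not the product URL input.

```php
<?php
// Hypothetical helper: turns form input such as "1200-1203" or "1200"
// into a list of product IDs. The "from-to" format is an assumption.
function parseProductIdRange($input, $maxSpan = 10000)
{
    if (!preg_match('/^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/', $input, $m)) {
        throw new InvalidArgumentException('Expected "123" or "123-456"');
    }
    $from = (int) $m[1];
    // A single ID is treated as a range of one.
    $to = (isset($m[2]) && $m[2] !== '') ? (int) $m[2] : $from;
    if ($to < $from || $to - $from >= $maxSpan) {
        throw new InvalidArgumentException('Invalid or too large ID range');
    }
    return range($from, $to);
}
```

The span limit is there so that a typing mistake in the form does not start a crawl of millions of IDs.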
It should allow us to choose which sites to crawl in a web submit form, just like a normal web form.
The project must be written in modules, so that we can add an extra crawl-site PHP file, say for the 9th site:
[url removed, login to view] in the module/ folder.
Your script should then auto-load that file, and we should not need to change anything in the other PHP files.
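The module idea could look roughly like the sketch below. The `SiteCrawler` interface, its method names and the `loadSiteModules()` function are hypothetical, and we assume each module class can be created without constructor arguments. Your own design may differ, as long as dropping a new file into module/ is enough.

```php
<?php
// Hypothetical contract that each site module is assumed to implement.
interface SiteCrawler
{
    public function siteId();
    public function crawlProduct($urlOrId);
}

// Includes every *.php file in the module folder, then returns one instance
// of each declared class that implements SiteCrawler.
function loadSiteModules($dir)
{
    $files = glob(rtrim($dir, '/') . '/*.php');
    foreach ($files ? $files : array() as $file) {
        require_once $file;
    }
    $crawlers = array();
    foreach (get_declared_classes() as $class) {
        $ref = new ReflectionClass($class);
        if ($ref->implementsInterface('SiteCrawler') && $ref->isInstantiable()) {
            $crawlers[] = $ref->newInstance(); // assumes a no-argument constructor
        }
    }
    return $crawlers;
}
```

A new site is then added by saving one file in the folder that defines a class implementing the interface; no other PHP file is touched.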
The auto-crawl settings, such as when to crawl and which sites to crawl, should be stored in the db, so please give us a web form to modify them.
Don't ask us to do it in phpMyAdmin. A normal end user doesn't know how to use phpMyAdmin.
After crawling a product, the script needs to update a log in MySQL so that we can check which NEW products were added and which product prices changed.
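As an illustration of what we mean by the log, the script could decide the log entry type like this before writing it to MySQL. The function name, the field names and the three result values are our assumptions only.

```php
<?php
// Hypothetical helper: decides what kind of log entry to write.
// $old is the stored row (null if the product was never crawled before),
// $new is the freshly crawled data. Field names are assumptions.
function classifyChange($old, array $new)
{
    if ($old === null) {
        return 'new';
    }
    foreach (array('list_price', 'special_price') as $field) {
        // Normalise to 2 decimals so that "19.90" from MySQL equals 19.9 from the crawl.
        $a = isset($old[$field]) ? number_format((float) $old[$field], 2, '.', '') : null;
        $b = isset($new[$field]) ? number_format((float) $new[$field], 2, '.', '') : null;
        if ($a !== $b) {
            return 'price_changed';
        }
    }
    return 'unchanged';
}
```

Only the 'new' and 'price_changed' results would need a row in the log table.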
We allow 10 days for this project (holidays included).
Our server runs Red Hat Linux, Apache 2.x, PHP 5.3.x.
We will escrow 50% of the payment once you are accepted for this project.
You need to send us the code to test on our server, or you can show it to us on your server.
Once we get your working code (at least 90% working), we will escrow the other 50%.
The payment will be released to you if and only if we have TESTED the script and it WORKS without problems on our server.
Normally the test takes about 1-2 days, depending on how fast your script can crawl.
It doesn't make sense for us to test for 1 day if your script needs 10 days to crawl, right?
A bonus will be given if your script is fast and clean and you communicate well.
Normally we pay an extra 10%-20%, depending on your work.
The deadline is super important for us. If you can't finish within the period you stated,
we have the right to cancel this project and you CANNOT argue about that.
If you DO NOT agree or DO NOT understand what we said, please DO NOT bid on this project.
Correction: sorry, the number of sites is 4, NOT 8.
You only need to crawl 4 sites.