Build Database to Store & Extract Data from Text Files (Easy $$)
$10-300 USD
Paid on delivery
**I started this project with another developer who fell ill and had to stop. Some of the notes below are mine and some of the execution methods are his.**
My goal: load several email lists into a database and later extract all of the email addresses whose email domain matches a URL in my domain list.
Example domain
[login to view URL]
[login to view URL]
[login to view URL]
Example text string
chris,jackson,chris@[login to view URL],1234567897
billy,bob,bbob@[login to view URL],84881451
john,doe@[login to view URL],8814
Example saved results
chris,jackson,chris@[login to view URL],1234567897
john,doe@[login to view URL],8814
I am looking for any line that contains an email address matching a domain in my domain list. So if I have [login to view URL], it should extract every line that contains an @[login to view URL] email address. Domains are listed one per line. My list will contain [login to view URL], but the search should look for @[login to view URL] so that it only pulls a valid email format and not ‘email@[login to view URL]’.
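The matching rule described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the regular expression, and the `example.com` style domains are assumptions, since the real domains are hidden in this posting. It compares the full domain after the `@` against the list, so an address at a longer domain that merely ends with a listed one is not pulled in.

```python
import re

# Assumption: the domain is the run of letters, digits, dots, and hyphens after "@".
EMAIL_DOMAIN = re.compile(r"@([A-Za-z0-9.-]+)")


def load_domains(lines):
    """Turn a one-domain-per-line list into a lowercase set for fast lookups."""
    return {line.strip().lower() for line in lines if line.strip()}


def row_matches(row, domains):
    """Return True if any email-like token in the row has a listed domain."""
    return any(
        m.group(1).lower().rstrip(".") in domains
        for m in EMAIL_DOMAIN.finditer(row)
    )


if __name__ == "__main__":
    wanted = load_domains(["example.com\n", "example.org\n"])
    for row in [
        "chris,jackson,chris@example.com,1234567897",
        "billy,bob,bbob@other.net,84881451",
        "john,doe@example.org,8814",
    ]:
        if row_matches(row, wanted):
            print(row)
```

Because the comparison is an exact set lookup, a subdomain such as `mail.example.com` would not match `example.com` in this sketch; whether it should is a decision for the project owner.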
IMPORTANT
Once imported into the db there will be around 500 million records. I have a list of 400,000 domains that I want to scan against it. To help speed up the process, I will have already purged domains from my email list that I do NOT want to find a match for (mostly free email accounts).
NOTES
-- The Source files are .csv files that I renamed to .txt for this purpose.
-- Some files have 1 column, and some have 3, 4, etc. Instead of trying to build a table to match the columns, we will treat each row of the CSV/text file as a single text string (as a single column) and import the entire row of data.
-- Each file is not the same so we cannot assume column 1 will always be email. That is why we will import an entire row of data and store it as a single value within the database. After exporting, I can clean up the data.
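A rough sketch of the single-column import described in these notes, using Python's built-in `sqlite3` module. The table name `records`, the column name `line`, and the batch size are hypothetical choices; the posting does not name a database engine.

```python
import sqlite3


def import_rows(conn, lines, batch_size=10000):
    """Store each non-empty line as one text value, committing once per batch."""
    conn.execute("CREATE TABLE IF NOT EXISTS records (line TEXT NOT NULL)")
    batch = []
    total = 0
    for line in lines:
        text = line.rstrip("\r\n")
        if not text:
            continue  # skip blank lines
        batch.append((text,))
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO records (line) VALUES (?)", batch)
            conn.commit()
            total += len(batch)
            batch.clear()
    if batch:  # flush the final partial batch
        conn.executemany("INSERT INTO records (line) VALUES (?)", batch)
        conn.commit()
        total += len(batch)
    return total


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    count = import_rows(conn, ["a,b,a@example.com,1\n", "c@example.org\n"])
    print(count, "rows imported")
```

Passing an open file object as `lines` streams the file row by row, so the column count of each source file never matters.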
REQUIREMENTS
-- The results need to save frequently instead of waiting until the task is complete.
-- It must have a basic UI – no commands. There should be three buttons.
1) Import Data (this allows me to select the file I want to import).
2) Load Domains List (This is the list of URLs I want to find emails for).
3) Match Records (Looks through the DB to find email strings that have a URL that matches a domain in the URL list).
-- It is also possible to have just two buttons and when I click Match, it asks me to select the [login to view URL] file. This is the file that has the URLs I want each email address to include.
-- Speed is very important.
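One way to meet the "save frequently" requirement is to write each match as soon as it is found and flush the output file at a fixed interval. The sketch below assumes a hypothetical `save_matches` helper that takes the matching test as a callable; the flush interval is an arbitrary example value.

```python
def save_matches(rows, is_match, out, flush_every=1000):
    """Write matching rows to `out` as they are found, flushing periodically.

    rows        -- any iterable of text lines
    is_match    -- callable returning True for rows that should be saved
    out         -- writable text file object
    flush_every -- number of saved rows between flushes (assumed value)
    """
    written = 0
    for row in rows:
        if is_match(row):
            out.write(row.rstrip("\r\n") + "\n")
            written += 1
            if written % flush_every == 0:
                out.flush()  # push buffered results out so far
    out.flush()  # make sure the tail is written too
    return written
```

If the process stops midway, everything up to the last flush is already in the results file. `flush()` hands the data to the operating system; calling `os.fsync` as well would be needed for a stronger guarantee against power loss.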
DEVELOPER NOTES
Here are some key points from the previous developer who started it.
-- Use C++ -- with some precomputation and indexing we can save a lot of time
-- Your issue looks mainly like an indexing issue: if you index emails by domain name, you won't have to go through all the data every time you search.
-- I talked with the DB Admin, and we are using nested Queries
-- Assuming that we have an index loaded in memory, it should not take longer than a few minutes at most (I expect).
-- As for the way we match the domains, we could use a regular expression that filters out only the domain. If that is too slow, then I will store the address as a subset of the domain, which should definitely work.
-- If there are any memory problems, we could use a stream instead of reading in the entire index in one go. This will be slower though, so first I'll try to get the entire index loaded into memory.
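The indexing idea in these notes can be illustrated with a small sketch. It swaps in SQLite via Python's `sqlite3` module in place of the in-memory C++ index the previous developer had in mind, and every table, column, and function name here is hypothetical. The domain is extracted once at import time and indexed, so a match run becomes an indexed join instead of a scan over every stored row.

```python
import re
import sqlite3

# Assumption: the domain is the run of letters, digits, dots, and hyphens after "@".
EMAIL_DOMAIN = re.compile(r"@([A-Za-z0-9.-]+)")


def build_index(conn, lines):
    """Store each row once and record every email domain found in it."""
    conn.executescript("""
        CREATE TABLE records (id INTEGER PRIMARY KEY, line TEXT NOT NULL);
        CREATE TABLE record_domains (domain TEXT NOT NULL, record_id INTEGER NOT NULL);
        CREATE TABLE wanted (domain TEXT PRIMARY KEY);
    """)
    for line in lines:
        text = line.rstrip("\r\n")
        if not text:
            continue
        cur = conn.execute("INSERT INTO records (line) VALUES (?)", (text,))
        found = {m.group(1).lower().rstrip(".") for m in EMAIL_DOMAIN.finditer(text)}
        conn.executemany(
            "INSERT INTO record_domains (domain, record_id) VALUES (?, ?)",
            [(d, cur.lastrowid) for d in found],
        )
    # Build the index after the bulk load, which is usually cheaper than maintaining it per insert.
    conn.execute("CREATE INDEX idx_record_domains ON record_domains (domain)")
    conn.commit()


def match(conn, wanted_domains):
    """Return stored rows whose email domain appears in the wanted list."""
    conn.execute("DELETE FROM wanted")
    conn.executemany(
        "INSERT OR IGNORE INTO wanted (domain) VALUES (?)",
        [(d.strip().lower(),) for d in wanted_domains if d.strip()],
    )
    query = """
        SELECT DISTINCT r.id, r.line
        FROM wanted AS w
        JOIN record_domains AS rd ON rd.domain = w.domain
        JOIN records AS r ON r.id = rd.record_id
        ORDER BY r.id
    """
    return [line for _, line in conn.execute(query)]
```

Iterating over the query cursor yields results one at a time, which corresponds to the streaming fallback mentioned in the last note if the full result set does not fit in memory.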
In your bid, include the following quoted text in the first line: “I reviewed the notes and understand that speed is important to you.” To help me understand that you are the right person for the job, let me know how soon you can start, when you can finish, and how you plan to develop this. The more detail you provide, the more confident I am that you are the best choice.
Make your best bid first as I will not be overpaying for this task. Do not bid the maximum budget amount as my max is lower than that.
Thanks!
Project ID: #18341246
About the project
Awarded to:
Hello. I have good skills in .NET, C++ Programming, Database Development, Database Programming, MySQL. I have read your project description carefully and i can do it. I hope to work with you. Contact me please. Th...
Hello, I reviewed the details and I understand that speed is important for you. I assume that you will search in millions of rows multiple times. Standard string matching algorithms will be too slow for that purpose...
19 freelancers have bid an average of $180 for this job
Hello? How are you? I have good experiences in "Build Database to Store & Extract Data from Text Files (Easy $$)" as you can see my profile for these (.NET, C++ Programming, Database Development, Database Programming...
I am expert who understands the value of time. I pride myself in my attention to detail. I am very hard working and aim to deliver in less time than quoted. I want to make you, my employer happy without changing my bid...
I reviewed the notes and understand that speed is important to you. Hi, It will be achieved by developing a C# win-forms application with buttons and grid. I have 11 years of experience in professional IT deve...
“I reviewed the notes and understand that speed is important to you.” Hi, I have developed several such applications in the past for different clients, and can do the same job for you. I can develop this for...
I reviewed the notes and understand that speed is important to you Hello, I'm a web and desktop developer with 8 years of experience in Database Administration & Desktop app, I read your project description. I can d...
Hi! My name is Ihor, I will be glad to help you with your project. I specialize in .NET development for 3+ years. Feel free to contact me any time to discuss details.
Hi, I am very good at C# and database programming. Please text me so we can start; I will assure quality and on-time delivery.
Hi, I read all the requirements. Please share more detail. I did 2 similar tasks and will provide 5-star-rated work. Thanks.
I am confident I am the right candidate for this project as I have done many similar projects in the past. With years of experience in this field, I believe this project will be very easy for me.