
Closed
Posted
I need a self-hosted search engine that can crawl and index a mix of PDF, Word, and plain-text files, then return accurate full-text results almost instantly.

Once the index is built, users must be able to:
• Enter a query and see matches ranked by relevance by default, with an optional toggle to sort by date.
• Narrow results by file type so they can quickly focus on just PDFs, DOCX files, or TXT notes.

A lightweight web interface or a small REST API is fine, whichever you feel will get the fastest, most reliable response times. I am comfortable provisioning a Linux server, so feel free to lean on Elasticsearch, Apache Lucene/Solr, or another open-source stack you trust; just outline why you picked it and any helper libraries (for example, Tika for document parsing) in your proposal.

Deliverables
1. Source code and setup script/container so I can deploy with a single command.
2. Clear README covering prerequisites, indexing instructions, and how to enable the sort/filter controls.
3. A brief test dataset plus test cases that demonstrate searches, date sorting, relevance scoring, and file-type filtering.

A Document Search Engine is a system that indexes and searches unstructured and semi-structured documents such as PDFs, Word files, text files, and scanned documents, allowing users to quickly find relevant information.

If the first pass runs smoothly, I may extend the project for OCR support and user-level permissions later on.
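
The search behaviour requested above (relevance by default, an optional date sort, and a file-type filter) could be expressed as an Elasticsearch-style request body. The following is only a minimal sketch under assumptions: the function name `build_search_body` and the field names `content`, `file_type`, and `modified` are hypothetical and not part of the brief.

```python
from typing import Optional


def build_search_body(
    query: str,
    file_type: Optional[str] = None,
    sort_by_date: bool = False,
) -> dict:
    """Build an Elasticsearch-style search request body.

    The field names "content", "file_type" and "modified" are assumptions
    made for this sketch; a real index may name them differently.
    """
    bool_query = {"must": [{"match": {"content": query}}]}
    if file_type is not None:
        # Filter clauses narrow the result set without changing the score.
        bool_query["filter"] = [{"term": {"file_type": file_type.lower()}}]

    body = {"query": {"bool": bool_query}}
    if sort_by_date:
        # Newest first; documents with equal dates fall back to relevance.
        body["sort"] = [{"modified": {"order": "desc"}}, "_score"]
    # With no "sort" key, results come back ordered by relevance score.
    return body
```

For example, `build_search_body("invoice", file_type="pdf", sort_by_date=True)` would describe a search restricted to PDFs with the newest documents first, while `build_search_body("invoice")` leaves ordering to the relevance score.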
Project ID: 40261361
37 proposals
Remote project
Active 11 hours ago
37 freelancers are bidding an average of ₹2,747 INR/hour on this project

Hello, I can build a fast, self-hosted full-text document search engine using Elasticsearch + Apache Tika (for robust PDF/DOCX/TXT parsing). This stack is production-proven, highly performant, and ideal for relevance scoring with optional date sorting and file-type filtering.
What you’ll get:
• Dockerized setup (single command deploy)
• Automatic crawler + indexer for PDFs, DOCX, TXT
• REST API + lightweight web UI
• Relevance ranking (BM25) + date sort toggle
• File-type filters (PDF/DOCX/TXT)
• Sample dataset + test queries
• Clear README with indexing + configuration steps
Designed for instant query response and easy extension (OCR via Tesseract, RBAC later). I’m ready to start and can deliver a solid MVP quickly.
₹2,500 INR in 40 days
6.8

As an accomplished backend developer with expertise in MySQL and web development, I believe I am an excellent candidate for your project. Building on over five years' experience, I have honed my skills in designing, developing, and implementing database structures, a perfect match for your needs. My proficiencies extend beyond fundamental backend development into complex logic and problem-solving, attributes that will prove invaluable for your full-text document search engine. My understanding of the Laravel framework combined with MySQL querying gives me an edge in optimizing code and ensuring speedy data retrieval, which is paramount for this kind of project. I always prioritize best practices to write efficient and reliable code while delivering high-quality work to ensure client satisfaction. The final product will include the source code and a setup script or container for straightforward deployment, so that you can get started with a single command. I understand the importance of clear documentation, which is why your README file will be concise yet comprehensive, providing setup instructions and indexing details alongside the sorting/filtering controls. Let me help you build a system that delivers outstanding search results on unstructured and semi-structured documents. Contact me now and let's embark on this project together!
₹2,500 INR in 40 days
4.9

Hi, regarding your Full-Text Document Search Engine project: we provide complete frontend-to-backend development with clean, scalable, and high-performance solutions tailored to your requirements. Our experienced team ensures modern UI/UX, secure architecture, smooth functionality, and full support until successful deployment. Let’s build a reliable and impactful product together. Regards, Muhammad Abdullah
₹2,500 INR in 40 days
4.6

I can build a self-hosted document search engine optimized for speed, accuracy, and easy deployment using Elasticsearch as the core indexing engine and Apache Tika for reliable content extraction from PDF, DOCX, and TXT files. Elasticsearch is the best fit because it provides extremely fast full-text search, built-in relevance scoring (BM25), efficient date sorting, and flexible filtering by file type with minimal overhead. Apache Tika will ensure consistent parsing and metadata extraction (including file type and modification date), while a lightweight REST API (Python FastAPI or Node.js) and a simple web interface will allow users to submit queries, toggle relevance/date sorting, and filter results instantly. This architecture is proven, scalable, and performs efficiently even on modest Linux servers. I will deliver a fully reproducible setup using Docker so you can deploy the entire system with a single command, along with complete source code, setup scripts, and a clear README explaining indexing, search usage, and how to enable sorting and filtering features. I will also include a small test dataset and documented test cases demonstrating relevance ranking, date sorting, and file-type filtering so you can validate performance immediately. The system will be designed with future extensibility in mind, making it straightforward to add OCR support (via Tesseract + Tika) and user-level access controls later without requiring architectural changes.
₹2,500 INR in 40 days
4.1
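
The BM25 relevance scoring mentioned in the proposal above can be illustrated with a small pure-Python sketch of the classic formula. This is an illustration, not Elasticsearch's actual implementation, which differs in details; the function name `bm25_scores` is hypothetical, whitespace tokenization is a simplification, and `k1=1.2`, `b=0.75` are the commonly used default parameters.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Score each document against the query with the classic BM25 formula."""
    # Simplifying assumption: lowercase and split on whitespace only.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue  # term absent from this document
            n = doc_freq[term]
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A document that repeats a query term, or that is shorter than average, tends to score higher, and a document without any query term scores zero.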

Hey, I noticed your project, Full-Text Document Search Engine, and believe I can help. My work in Java has prepared me well for this kind of project. Looking forward to hearing your thoughts.
₹2,500 INR in 7 days
4.6

Leveraging my vast skill set and extensive experience in Python, REST API, and Web Development, I, at PaperPerfect, will create a highly efficient, self-hosted search engine that aligns with your project goals. I have had ample opportunities to work with the technologies you mentioned, Elasticsearch and Apache Lucene/Solr, which I believe remain the most reliable choices for giving you the fastest possible response times. Moreover, your preference for a user-friendly interface matches my development philosophy. I assure you an interface that is not only lightweight but also intuitive; one that readily caters to your specific needs as well as those of your users. In choosing PaperPerfect, you are selecting a team fueled by dedication and professionalism. I will organize and streamline the source code/setup script/container so you can easily deploy the code with a single command – thereby minimizing post-development hassles as much as possible. My commitment doesn’t stop at delivering code; I also ensure that our README includes easy-to-follow instructions covering prerequisites, indexing procedures, and how to enable sort/filter controls. Additionally,
₹2,500 INR in 40 days
3.3

Hi, there. I’m very interested in building your self-hosted document search engine for PDFs, Word files, and plain-text documents with near-instant full-text results. I recommend using Elasticsearch as the core search engine because it’s fast, scalable, and handles complex queries with relevance scoring, while Apache Tika can extract text reliably from PDFs and DOCX files. The system will allow users to enter a query, see results ranked by relevance, optionally sort by date, and filter by file type (PDF, DOCX, TXT). I can provide a lightweight web interface or REST API for querying, whichever gives the fastest, most reliable response times on your Linux server. Deliverables include source code with a setup script or container for single-command deployment, a clear README with prerequisites and indexing instructions, and a small test dataset with sample queries demonstrating relevance ranking, sorting, and file-type filtering. The architecture will be modular so you can later add OCR support and user-level permissions without rewriting the core indexing logic. My goal is a production-ready system that makes searching large document collections fast, reliable, and extensible for future features. I hope to hear from you. Thank you.
₹2,500 INR in 16 days
3.5

Hello, I just read your post, and it seems you are looking for a developer skilled in building a self-hosted full-text search engine that can crawl and index PDFs, DOCX, and TXT files, then return fast, relevance-ranked results with optional date sorting and file-type filtering. With years of experience designing search pipelines using Elasticsearch/OpenSearch or Solr/Lucene, document parsing with Apache Tika, fast query APIs, and Dockerized deployments on Linux, I am confident that I can bring your vision to life in the shortest possible time. I can deliver a single-command deploy (Docker/Compose), an indexing crawler, and a lightweight web UI or REST API that supports relevance ranking by default, a date-sort toggle, and filters for PDF/DOCX/TXT, along with a clear README and test dataset/test cases to prove search, scoring, sorting, and filtering behavior. Let’s connect and see how much value I can add to your business. Best Regards, Raka
₹2,500 INR in 40 days
3.3

Drawing from my strong proficiency in Python, particularly Django and Flask, I am confident that I have the skills necessary to create a powerful and user-intuitive search engine for all your document needs. Over the years, I have been consistently delivering scalable, efficient, and reliable web applications that are always aligned with users’ needs. Furthermore, my extensive experience with web development has given me a deep understanding of both front-end and back-end requirements, making me particularly suitable for a project of this complexity. I will leverage Django's robust framework to build a well-structured web interface or REST API that guarantees lightning-fast response times. Given your comfort with provisioning a Linux server, we can deploy the project seamlessly with just one command. Moreover, I am no stranger to open-source stacks like Elasticsearch, Apache Lucene/Solr which will be essential in designing a high-performing indexing system for your diverse mix of documents. My approach to delivering projects involves concise documentation for easy future reference; you are assured of getting thorough README files covering prerequisites, setup instructions, and even some test cases to showcase the system’s capabilities
₹2,500 INR in 40 days
2.9

You’re looking to build a self-hosted search engine that indexes PDFs, Word, and plain-text files with fast, accurate full-text search, including relevance ranking and filtering by file type and date. I understand you need a lightweight web interface or REST API with easy deployment on a Linux server, plus clear documentation and test cases. With over 15 years of experience and 200+ projects, I specialize in Python, Linux, MySQL, Elasticsearch, REST API development, and containerization using Docker. These skills align perfectly with your need for a robust, scalable search solution that handles document parsing and indexing efficiently. I plan to use Elasticsearch combined with Apache Tika for document parsing to ensure reliable extraction from diverse file types. The backend will expose a REST API with endpoints for querying, sorting, and filtering, while deployment will be containerized with a setup script for one-command installation. I estimate a functional prototype within two weeks, including test data and comprehensive documentation. Let’s discuss how I can tailor this solution to your exact needs and get your search engine up and running smoothly.
₹2,750 INR in 7 days
2.4

Hello, I can build a fast, self-hosted document search engine that indexes PDF, DOCX, and TXT files and delivers near-instant full-text results with relevance ranking, date sorting, and file-type filtering.
Proposed stack (proven, scalable, and fully open-source):
• Elasticsearch – high-performance indexing and relevance scoring, ideal for instant search across large document sets
• Apache Tika – reliable parsing and text extraction from PDF, Word, and text files
• Lightweight REST API (Python or .NET) – handles indexing control, filtering, and search requests
• Simple web UI (optional) – clean interface for query, sort toggle (relevance/date), and file-type filters
Why this approach: Elasticsearch provides extremely fast query performance, built-in relevance ranking (BM25), and native filtering/sorting capabilities. Tika ensures accurate content extraction without custom parsers. This combination is widely used in enterprise-grade document search systems. The system will be designed so future features like OCR, access control, or incremental indexing can be added easily.
Estimated timeframe: 3–5 days for full working deployment and documentation.
Best regards
₹2,500 INR in 40 days
2.1

I’m confident enough in delivering this that I’m happy to outline the full architecture and indexing flow in chat before you award the bid.
I’d build this using Elasticsearch + Apache Tika:
• Tika for extracting text + metadata from PDF, DOCX, and TXT
• Elasticsearch for fast inverted indexing, relevance scoring (BM25), and date sorting
• Lightweight FastAPI REST layer (or minimal web UI) for instant queries
• File-type filtering via indexed metadata field
• Optional Docker setup for one-command deployment
You’ll receive:
• Containerized setup (docker-compose)
• Clean source code + indexing scripts
• README with full deployment + usage guide
• Test dataset + test cases proving relevance ranking, date sorting, and file-type filtering
The system will be optimized for near-instant query responses after indexing. If this aligns with what you want, reach out via chat and let’s confirm dataset size and expected document volume before I begin.
₹2,500 INR in 40 days
1.6
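
The "indexed metadata field" idea in the proposal above could look roughly like the following index mapping, written here as a Python dictionary. It is a sketch under assumptions: the constant name and all field names are hypothetical. Keyword fields are matched exactly, which suits file-type filters, while a date field supports sorting by modification time.

```python
# Hypothetical Elasticsearch index mapping; every field name here is an
# illustrative assumption, not something specified in the proposal.
DOCUMENT_INDEX_MAPPING = {
    "mappings": {
        "properties": {
            # Exact-match identifier for the source file.
            "path": {"type": "keyword"},
            # Analyzed full text extracted from the document.
            "content": {"type": "text"},
            # Exact-match value such as "pdf", "docx" or "txt" for filtering.
            "file_type": {"type": "keyword"},
            # Modification timestamp used for the date-sort toggle.
            "modified": {"type": "date"},
        }
    }
}
```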

Hi there. I have development experience in Java, Apache Solr, and Spring. CV in private message if needed. Can you provide examples of documents to be indexed? Any UI requirements?
₹2,500 INR in 40 days
1.7

Having a profound knowledge in Java, Python and Web Development, I vow to deliver an exceptional performance for your Full-Text Document Search Engine. My experience specifically with project management and business development allows me to construct a system that indexes and searches various file types including the unstructured ones such as PDFs, Word Files and text files. My skills resonate well with your project description which requires index building as well as searching capability. Furthermore, my proficiency in web development can ensure a lightweight web interface or a REST API that will facilitate fast and reliable response times for returning accurate full-text results. With respect to your comfort of provisioning a Linux server, I can utilize my expertise with open-source stacks like Elasticsearch, Apache Lucene/Solr or any other stack you find suitable, outlining clear instructions on how to enable sort/filter controls. Finally, since this is not only about development but also about documentation and knowledge-sharing, my experience with professional documentation can guarantee you a clear README covering prerequisites and indexing instructions. My past clients praise my clear communication skills, focused approach and most importantly, my ability to deliver results that actually help their business grow - I look forward to bringing those same qualities to your project.
₹2,500 INR in 40 days
0.0

Hello, I have strong experience in building search-based and AI-driven systems using Python, Linux, and modern backend technologies. I can develop a high-performance self-hosted document search engine that crawls and indexes PDFs, DOCX, and TXT files with near-instant full-text search results.
Proposed Tech Stack:
• Elasticsearch for fast indexing & relevance-based ranking
• Apache Tika for PDF/Word text extraction
• Python (FastAPI) for lightweight REST API
• Docker for one-command deployment
• Linux-based optimized setup
Features I Will Deliver:
✔ Relevance-based search with optional date sorting
✔ File-type filtering (PDF, DOCX, TXT)
✔ Clean REST API or minimal web UI (as preferred)
✔ Containerized deployment (single command setup)
✔ Proper README with indexing & filter instructions
✔ Test dataset + test cases for scoring and filtering validation
I have 1.5+ years of experience building AI chatbots, ML models, and Flask-based applications, so I understand document parsing, backend systems, and performance optimization well. If required later, I can also extend this system with OCR support and role-based user permissions. Looking forward to building a fast and reliable search solution for you.
Best regards, Shahnawaz Hussain
₹2,500 INR in 40 days
0.0

A search engine should not just find documents; it should remove friction from decision-making. When teams can retrieve the exact paragraph they need in milliseconds, productivity compounds.
Your goal is clear: build a fast, self-hosted document search system that indexes PDFs, DOCX, and TXT files, ranks results by relevance, allows date sorting, and filters by file type, all with reliable performance and clean deployment. This isn’t just about indexing files; it’s about building a scalable retrieval foundation that can later support OCR and permission layers.
I specialize in architecting high-performance search systems using Elasticsearch with structured mappings, Apache Tika parsing pipelines, and lightweight REST layers for instant querying. My approach prioritizes:
• Clean indexing architecture
• Deterministic relevance tuning
• Filtered query optimization
• Containerized one-command deployment
I design systems with future extensibility in mind: OCR integration, user-level access control, and scaling to large document sets without refactoring the core. You won’t just receive working code; you’ll receive a structured, production-ready search foundation built for growth.
Scalable Architecture. Precision Indexing. Instant Retrieval.
Best regards, Andrew
₹2,500 INR in 40 days
0.0

Hello, I can build a self-hosted, high-speed document search engine that crawls and indexes PDF, DOCX, and TXT files, delivering near-instant full-text search with relevance ranking, date sorting, and file-type filtering. I have strong backend and systems experience and will design this to be clean and scalable
₹2,500 INR in 40 days
0.0

Hi! I'm a full-stack developer specializing in Python, Linux, Docker, and search infrastructure - this is exactly the kind of project I enjoy building.
My proposed stack:
- Elasticsearch (preferred for speed + REST API out-of-the-box) containerized with Docker
- Apache Tika for parsing PDF, DOCX, and TXT files with accurate text extraction
- Python indexing pipeline: crawl directory → parse with Tika → index into Elasticsearch
- Lightweight REST API (FastAPI) or simple web UI for search queries
- Features: relevance ranking (default), date sorting toggle, file-type filtering (PDF/DOCX/TXT)
- Single-command deployment via Docker Compose
Deliverables:
1. Full source code + Docker Compose setup (single command deploy)
2. Clear README: prerequisites, indexing instructions, filter/sort controls
3. Test dataset + test cases covering relevance scoring, date sorting, and file-type filtering
Bonus: I can add OCR support via Tesseract in the same pipeline if you want to extend later.
Why Elasticsearch over Solr: better REST API, faster setup, and excellent Python client support.
Ready to start immediately - can I confirm the expected document volume and server specs?
₹2,500 INR in 40 days
0.0
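
The "crawl directory → parse → index" pipeline outlined in the proposal above might begin with a crawl step like the one below. This is only a sketch under assumptions: `crawl`, `read_plain_text`, and the document field names are hypothetical, and the default extractor only makes sense for .txt files. A real pipeline would pass in an extractor backed by a parser such as Apache Tika for PDF and DOCX.

```python
import os
from datetime import datetime, timezone
from typing import Callable, Iterator

# Extension-to-type table for the three formats named in the brief.
SUPPORTED_EXTENSIONS = {".pdf": "pdf", ".docx": "docx", ".txt": "txt"}


def read_plain_text(path: str) -> str:
    """Stand-in extractor for .txt files; a real pipeline would call a parser here."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read()


def crawl(root: str, extract_text: Callable[[str], str] = read_plain_text) -> Iterator[dict]:
    """Walk `root` and yield one index-ready document per supported file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            extension = os.path.splitext(name)[1].lower()
            file_type = SUPPORTED_EXTENSIONS.get(extension)
            if file_type is None:
                continue  # skip unsupported file types
            path = os.path.join(dirpath, name)
            modified = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
            yield {
                "path": path,
                "file_type": file_type,
                # ISO 8601 timestamp, suitable for a date field.
                "modified": modified.isoformat(),
                "content": extract_text(path),
            }
```

Each yielded dictionary could then be sent to the search index by whatever client the final stack uses; keeping extraction behind a callable leaves room to add OCR later without changing the crawl loop.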

Hello. Please see my portfolio: https://www.freelancer.com/u/felipeg207 After studying your proposal, I'm confident I'm a great developer for your project and I am very interested in your details. I have experience completing similar projects from start to finish. I'd like to discuss the project in more detail. I look forward to your response. Thank you.
₹2,500 INR in 40 days
0.0

As an experienced Java programmer with 10+ years in the field, I have tackled and successfully delivered projects of similar magnitude and complexity. Over time, I've honed my skills in document parsing and search engine development using open-source stacks like Elasticsearch and Apache Lucene, which suit your need for a lightweight web interface or a REST API. My knowledge of Tika will further streamline the process by facilitating smooth conversion of PDFs, Word files, and text files into a common format for comprehensive indexing. I am particularly proud of my efficiency in system design and my capacity to deliver clean, well-structured code that is easy to understand, deploy, and maintain. A testament to this is my 100% delivery record. You won't need to worry about compatibility issues either - I am proficient with multiple platforms including Linux, which is essential in your case. This means you'll receive not just a bullet-proof search engine, but also a single-command deployment solution with crystal-clear instructions.
₹3,500 INR in 40 days
0.0

Faridabad, India
Payment method verified
Joined March 4, 2022