pdf

polar-bookshelf

Last commit: 23 May 2020 | GitHub stars: 3816 (1920/yr) | Issues: 976

Polar is a personal knowledge repository that supports advanced features like incremental reading, annotation, comments, and spaced repetition. It supports reading PDFs and web content and was created using the Electron framework and PDF.js.

PDF support: We have first-class PDF support thanks to PDF.js. PDFs work well when reading content in book format or when reading scientific research, which is often stored as PDF.

Captured web pages: Download HTML content and save it as offline documents that can be annotated.

ambar

Last commit: 28 Apr 2020 | GitHub stars: 1459 (418/yr) | Issues: 3

Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search.

Ambar defines a new way to implement full-text document search into your workflow.

Tutorial: Mastering Ambar Search Queries

Ambar 2.0 only supports local file system crawling. If you need to crawl an SMB share or an FTP location, just mount it using standard Linux tools. Crawling is automatic and needs no schedule, because the crawlers monitor file system events and automatically process new, changed and removed files.
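To illustrate the idea of picking up new, changed and removed files, here is a minimal Python sketch. It is not Ambar's implementation: Ambar reacts to file system events, whereas this sketch swaps in a simpler snapshot comparison. The function names `take_snapshot` and `diff_snapshots` are hypothetical and chosen only for this example.

```python
import os
from typing import Dict, Set, Tuple

# Maps a file path to (modification time, size in bytes).
Snapshot = Dict[str, Tuple[float, int]]


def take_snapshot(root: str) -> Snapshot:
    """Record mtime and size for every file under root (hypothetical helper)."""
    snapshot: Snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                stat = os.stat(path)
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
            snapshot[path] = (stat.st_mtime, stat.st_size)
    return snapshot


def diff_snapshots(old: Snapshot, new: Snapshot) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return the sets of (added, changed, removed) paths between two snapshots."""
    old_paths, new_paths = set(old), set(new)
    added = new_paths - old_paths
    removed = old_paths - new_paths
    # A file counts as changed when its mtime or size differs.
    changed = {p for p in old_paths & new_paths if old[p] != new[p]}
    return added, changed, removed
```

A crawler built this way would take a snapshot on each pass and hand the three sets to its indexing step. An event-based design, as described above, avoids the repeated directory walk.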

ledgersmb

Last commit: 24 May 2020 | GitHub stars: 180 (29/yr) | Issues: 291

Small and Medium business accounting and ERP

LedgerSMB is a free integrated web application accounting system, featuring double entry accounting, budgeting, invoicing, quotations, projects, timecards, inventory management, shipping and more ...

The UI allows world-wide accessibility; with its data stored in the enterprise-strength PostgreSQL open source database system, the system is known to operate smoothly for businesses with thousands of transactions per week. Screens and customer visible output are defined in templates, allowing easy and fast customization. Supported output formats are PDF, CSV, HTML, ODF and more.

shaark

Last commit: 28 Apr 2020 | GitHub stars: 143 (190/yr) | Issues: 12

Shaark is a self-hosted platform to keep and share your content: web links, posts, passwords and pictures.

All of your data can be private, public or both and can be browsed by tags or all-in-one search.

Shaark is production ready, inspired by Shaarli, built with Laravel and Vue.js.


A public demo is available at https://shaark.mka.ovh. Credentials are admin@example.com and secret. This demo is reset hourly.

All contributions are welcome! Please use the dev branch for your pull requests.
If you make changes to JS, don't compile assets in production; for security reasons, I'll compile them manually when merging.

papermerge

Last commit: 22 May 2020 | GitHub stars: 103 (271/yr) | Issues: 3

In a nutshell, Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS.

Papermerge DMS in turn will OCR the document and index it. You will be able to quickly find any (scanned!) document using full-text search capabilities.
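As a rough illustration of how indexed text makes documents searchable, here is a minimal inverted-index sketch in Python. It is not Papermerge's implementation, and it assumes the OCR step has already produced plain text for each document. The names `tokenize`, `build_index` and `search` are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map every word to the set of document ids whose text contains it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the ids of documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Intersect the posting sets so that all query words must match.
    return set.intersection(*(index.get(word, set()) for word in words))
```

Here `documents` would map a file name to its OCR output, and `search(index, "acme invoice")` would return the names of documents containing both words.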

You can try it with just 3 simple commands (you need git and docker-compose):

bepasty-server

Last commit: 30 Jul 2019 | GitHub stars: 74 (12/yr) | Issues: 29

bepasty is like a pastebin for all kinds of files (text, image, audio, video, documents, ..., binary).

The documentation is available at https://bepasty-server.readthedocs.org/en/latest/

docspell

Last commit: 23 May 2020 | GitHub stars: 47 (55/yr) | Issues: 4

Docspell is a personal document organizer. You'll need a scanner to convert your papers into PDF files. Docspell can then assist in organizing the resulting mess 😉.

You can associate tags, set correspondents, what a document is concerned with, a name, a date and some more. If your documents are associated with this metadata, you should be able to quickly find them later using the search feature. But adding this manually to each document is a tedious task. What if most of it could be done automatically?