Table Outliner Model Service

Overview

The TableOutliner service detects and extracts tables from document images. For each page of a document, the model locates table boundaries by detecting vertical and horizontal lines. It groups these lines into rows and columns, matches them to form table structures, and discards any extraneous lines that do not align with the detected table boundaries.
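The grouping and filtering steps described above can be illustrated with a minimal sketch. It is not the service's actual implementation: the segment format, the function names `cluster_positions`, `crosses`, and `outline_table`, the pixel tolerance, and the "crosses at least two lines" rule are all assumptions chosen for illustration, and line detection itself is taken as already done.

```python
from typing import List, Tuple

# Assumed segment format: (x1, y1, x2, y2) in pixels.
Segment = Tuple[int, int, int, int]


def cluster_positions(values: List[int], tol: int = 3) -> List[int]:
    """Merge coordinates that lie within `tol` of their neighbour."""
    clusters: List[List[int]] = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= tol:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return [round(sum(c) / len(c)) for c in clusters]


def crosses(h: Segment, v: Segment, tol: int) -> bool:
    """True if horizontal segment h and vertical segment v meet (within tol)."""
    hx1, hy, hx2, _ = h
    vx, vy1, _, vy2 = v
    return (
        min(hx1, hx2) - tol <= vx <= max(hx1, hx2) + tol
        and min(vy1, vy2) - tol <= hy <= max(vy1, vy2) + tol
    )


def outline_table(
    horizontal: List[Segment], vertical: List[Segment], tol: int = 3
) -> Tuple[List[int], List[int]]:
    """Return (row_ys, col_xs), dropping lines that do not join the grid."""
    # Keep a horizontal line only if it meets at least two vertical lines.
    kept_h = [
        h for h in horizontal
        if sum(crosses(h, v, tol) for v in vertical) >= 2
    ]
    # Keep a vertical line only if it meets at least two kept horizontals.
    kept_v = [
        v for v in vertical
        if sum(crosses(h, v, tol) for h in kept_h) >= 2
    ]
    rows = cluster_positions([h[1] for h in kept_h], tol)
    cols = cluster_positions([v[0] for v in kept_v], tol)
    return rows, cols
```

Given three horizontal and three vertical lines forming a grid, plus a stray underline and a stray page border, the sketch keeps only the grid lines and returns their row and column positions.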

Installation

Install the required dependencies with pip, passing the FURY_AUTH authentication token as an environment variable:

FURY_AUTH=${FURY_AUTH} pip install --use-deprecated=legacy-resolver -r requirements.txt

Running the Service Locally

Launching the Server:

Start the server by running:

python app/server.py

[Optional] Monitor Service Progress in Real-Time via WebSocket

  • In Postman, set the request type to WebSocket
  • Enter the WebSocket URL: ws://localhost:8000/table-outliner/ws
  • Click Connect
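The same connection can be made from a script instead of Postman. The sketch below assumes the third-party `websockets` package is installed (it is not listed in this README) and simply prints each message as it arrives; the message format is not documented here, so no parsing is attempted. The function name `watch_progress` is hypothetical.

```python
import asyncio

WS_URL = "ws://localhost:8000/table-outliner/ws"


async def watch_progress(url: str = WS_URL) -> None:
    """Print every progress message the service sends over the WebSocket."""
    # Assumption: the third-party `websockets` package is available.
    # Imported here so the module loads even when it is not installed.
    import websockets

    async with websockets.connect(url) as ws:
        async for message in ws:
            print(message)


# With the server running locally:
# asyncio.run(watch_progress())
```

Start the watcher before uploading a file so that no updates are missed.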

Uploading PDF File to Process:

  • Open Postman
  • Set the request type to HTTP
  • Set the method to POST
  • Use the URL http://localhost:8000/table-outliner
  • In the request body (form-data), add a key named file and set its type to File
  • Select the PDF file from your computer that you wish to upload
  • Click Send to upload the file and start table detection
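The Postman steps above amount to a multipart/form-data POST with a single field named file. A minimal sketch using only the Python standard library follows; the helper names `build_multipart` and `upload_pdf` are hypothetical, and the sketch assumes the server accepts a standard multipart upload and replies with JSON.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple

URL = "http://localhost:8000/table-outliner"


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_pdf(path: str, url: str = URL) -> dict:
    """POST the PDF at `path` under the form key "file" and return the JSON reply."""
    pdf = Path(path)
    body, content_type = build_multipart("file", pdf.name, pdf.read_bytes())
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# With the server running locally:
# print(upload_pdf("uploaded_file.pdf"))
```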

Viewing the Output:

  • Response Format: The response is a JSON object of the following form:

    {
      "result": "formatted tables",
      "filename": "uploaded_file.pdf"
    }

  • WebSocket Updates: If connected via WebSocket, you will receive real-time updates on the progress of table detection.
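A client can read the two documented fields from the response as sketched below. The function name `parse_response` is hypothetical, and the sample string only mirrors the placeholder shown above; the actual structure of the "result" value is not specified in this README, so it is returned untouched.

```python
import json
from typing import Any, Tuple


def parse_response(raw: str) -> Tuple[Any, str]:
    """Return (result, filename) from the service's JSON response text."""
    payload = json.loads(raw)
    # Fail early if either documented key is missing.
    missing = {"result", "filename"} - payload.keys()
    if missing:
        raise ValueError(f"response is missing keys: {sorted(missing)}")
    return payload["result"], payload["filename"]


sample = '{"result": "formatted tables", "filename": "uploaded_file.pdf"}'
result, filename = parse_response(sample)
print(filename, result)
```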