Table Outliner Model Service
Overview
The TableOutliner service detects and extracts tables from document images. For each page of a document, the model identifies table boundaries by detecting vertical and horizontal lines. It groups these lines into rows and columns, matches them to form table structures, and filters out extraneous lines that do not align with the detected table boundaries.
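The grouping and filtering steps described above can be illustrated with a small sketch. This is not the service's actual implementation: the `HLine` and `VLine` types, the `crosses` helper, the pixel tolerance, and the "at least two crossings" rule are all assumptions made for illustration, and the line detection itself is assumed to have already produced the segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HLine:
    """Hypothetical horizontal segment at height y, spanning x0..x1."""
    y: float
    x0: float
    x1: float


@dataclass(frozen=True)
class VLine:
    """Hypothetical vertical segment at position x, spanning y0..y1."""
    x: float
    y0: float
    y1: float


def crosses(h, v, tol=2.0):
    """Return True if the two segments meet, allowing a small tolerance."""
    return (h.x0 - tol <= v.x <= h.x1 + tol) and (v.y0 - tol <= h.y <= v.y1 + tol)


def outline_table(hlines, vlines, tol=2.0):
    """Drop stray lines and return the outline of a single table, or None.

    Assumption: a line belongs to the table only if it meets at least two
    lines of the other orientation.
    """
    kept_h = [h for h in hlines if sum(crosses(h, v, tol) for v in vlines) >= 2]
    # Vertical lines are checked against the surviving horizontal lines only.
    kept_v = [v for v in vlines if sum(crosses(h, v, tol) for h in kept_h) >= 2]
    if len(kept_h) < 2 or len(kept_v) < 2:
        return None
    rows = sorted({h.y for h in kept_h})  # row boundaries (y coordinates)
    cols = sorted({v.x for v in kept_v})  # column boundaries (x coordinates)
    return {"bbox": (cols[0], rows[0], cols[-1], rows[-1]), "rows": rows, "cols": cols}
```

Calling `outline_table` with a grid of segments plus a few unrelated ones returns the bounding box and the row and column boundaries of the grid, with the unrelated segments ignored. A real page may contain several tables, which this sketch does not handle.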
Installation
Install the required dependencies with pip, passing the FURY_AUTH authentication token as an environment variable:
FURY_AUTH=${FURY_AUTH} pip install --use-deprecated=legacy-resolver -r requirements.txt
Running the Service Locally
Launching the Server:
Start the server by running:
python app/server.py
[Optional] Monitor Service Progress in Real-Time via WebSocket
- In Postman, set the request type to WebSocket
- Enter the WebSocket URL: ws://localhost:8000/table-outliner/ws
- Click Connect
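The same connection can also be made from a script instead of Postman. The following is a minimal sketch, assuming the third-party `websockets` package is installed (`pip install websockets`); the `watch_progress` function and its `connect` and `on_message` parameters are hypothetical names introduced here. The format of the progress messages is not specified in this document, so the sketch passes each message along unparsed.

```python
import asyncio

WS_URL = "ws://localhost:8000/table-outliner/ws"


async def watch_progress(url=WS_URL, connect=None, on_message=print):
    """Connect to the progress WebSocket and hand each message to on_message.

    `connect` can be replaced for testing; by default the third-party
    `websockets` package is imported lazily and its connect() is used.
    """
    if connect is None:
        import websockets  # third-party dependency, assumed to be installed
        connect = websockets.connect
    async with connect(url) as ws:
        # Iterate until the server closes the connection.
        async for message in ws:
            on_message(message)


# Usage (requires the server to be running locally):
# asyncio.run(watch_progress())
```

Start the listener before uploading a file so that no updates are missed.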
Uploading PDF File to Process:
- Open Postman
- Set the request type to HTTP
- Set the method to POST
- Use the URL http://localhost:8000/table-outliner
- Add a key with the type File and name it file
- Select the PDF file from your computer that you wish to upload
- Click Send to upload the file and start the table detection and outlining process
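The Postman steps above amount to a multipart/form-data POST with a single field named file. As a rough equivalent using only the Python standard library, the sketch below builds that request by hand; `build_multipart` and `upload_pdf` are hypothetical helper names, and the `application/pdf` content type is an assumption about what the server accepts.

```python
import json
import os
import urllib.request
import uuid

UPLOAD_URL = "http://localhost:8000/table-outliner"


def build_multipart(field, filename, data, content_type="application/pdf"):
    """Return (body, Content-Type header) for a single-file form upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_pdf(path, url=UPLOAD_URL):
    """POST the PDF at `path` under the form field 'file' and decode the JSON reply."""
    with open(path, "rb") as fh:
        data = fh.read()
    body, ctype = build_multipart("file", os.path.basename(path), data)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": ctype}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (requires the server to be running locally):
# print(upload_pdf("uploaded_file.pdf"))
```

Building the body manually avoids extra dependencies; a higher-level HTTP client would shorten this considerably.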
Viewing the Output:
- Response Format: The response is returned as JSON in the following format:
{
  "result": "formatted tables",
  "filename": "uploaded_file.pdf"
}
- WebSocket Updates: If connected via WebSocket, you will receive real-time updates on the progress of the table detection and outlining process.
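A client consuming the response may want to check that both documented keys are present before using them. The sketch below shows one way to do that; `parse_response` is a hypothetical helper, and the sample only confirms that `result` and `filename` exist, not what type `result` has in practice.

```python
import json


def parse_response(text):
    """Decode the service's JSON reply and return (filename, result).

    Raises ValueError if either documented key is missing.
    """
    payload = json.loads(text)
    missing = {"result", "filename"} - set(payload)
    if missing:
        raise ValueError(f"response is missing keys: {sorted(missing)}")
    return payload["filename"], payload["result"]


sample = '{"result": "formatted tables", "filename": "uploaded_file.pdf"}'
filename, result = parse_response(sample)
```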