We often need to convert PDF files to plain text for data analysis, search indexing, or content repurposing. Whether you want to build a PDF to text converter, automate OCR online, or copy text from PDF documents for further processing, our REST API provides a reliable, developer-friendly solution.

PDF Conversion API

The Aspose.PDF Cloud SDK for .NET is a wrapper around the Aspose.PDF Cloud REST API that simplifies document text extraction and PDF processing. It handles both text-based and image-based PDFs and returns the extracted text as structured output that can be saved in TXT format.

Key features include:

  • PDF to TXT file extraction with high accuracy.
  • Cross-platform REST API — works seamlessly in C#, .NET Core, or any environment with HTTP support.
  • Support for partial extraction — define regions and extract text from specific areas.
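Region-based extraction comes down to four numbers: the lower-left and upper-right corners of a rectangle. The following is a minimal sketch of how those numbers map onto the text endpoint used in the cURL section later in this article. The helper name build_text_url is hypothetical, and the sketch assumes coordinates are expressed in PDF points (1/72 inch) measured from the bottom-left corner of the page, which is the usual PDF convention but worth verifying against the API reference.

```shell
# Build the text-extraction URL for a rectangular region.
# Arguments: file name, lower-left x, lower-left y, upper-right x, upper-right y.
# Assumption: coordinates are PDF points with the origin at the bottom-left.
build_text_url() {
    printf 'https://api.aspose.cloud/v3.0/pdf/%s/text?LLX=%s&LLY=%s&URX=%s&URY=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Example: the top half of a US Letter page (612 x 792 points).
build_text_url "inputPDF.pdf" 0 396 612 792
```

File names containing spaces or other reserved characters would need URL-encoding before being passed to this helper.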

To get started, add the SDK to your .NET project using NuGet:

Install-Package Aspose.PDF-Cloud

Then, visit the Aspose Cloud Dashboard to obtain your Client ID and Client Secret credentials.
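A common way to keep these credentials out of source code is to export them as environment variables and check for them before making any API call. The sketch below assumes the hypothetical variable names ASPOSE_CLIENT_ID and ASPOSE_CLIENT_SECRET; Aspose does not prescribe these names, so use whatever fits your setup.

```shell
# Hypothetical variable names; set them in your shell profile or CI secret store:
#   export ASPOSE_CLIENT_ID="your-client-id"
#   export ASPOSE_CLIENT_SECRET="your-client-secret"

# Returns 0 only if both credentials are set and non-empty;
# otherwise reports the first missing variable on stderr and returns 1.
require_credentials() {
    for var in ASPOSE_CLIENT_ID ASPOSE_CLIENT_SECRET; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "Missing environment variable: $var" >&2
            return 1
        fi
    done
}
```

In a C# project, the same values can then be read with Environment.GetEnvironmentVariable instead of being hard-coded.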

PDF to Text Conversion using C# .NET

Let’s look at how to convert a PDF to a text file in C# using the .NET REST API.

Step 1. - Create an instance of the PdfApi class using your client credentials.

// Note the argument order used here: client secret first, then client ID.
PdfApi pdfApi = new PdfApi(clientSecret, clientID);

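Step 2. - Read the input PDF file and upload it to cloud storage.

// Upload the local PDF as "inputPDF.pdf"; the using statement disposes the stream afterwards.
using (var sourceFile = File.OpenRead(inputFile))
    pdfApi.UploadFile("inputPDF.pdf", sourceFile);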

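Step 3. - Specify the rectangular region in the PDF and extract its text using the GetText(...) method.

// LLX/LLY are the lower-left corner and URX/URY the upper-right corner of the region; the optional arguments stay null.
TextRectsResponse response = pdfApi.GetText("inputPDF.pdf", LLX, LLY, URX, URY, null, null, null, null, null);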

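Step 4. - Iterate through the list of text occurrences and save each one to a file on the local drive.

// Write one text occurrence per line; the output file name is an example.
using (var output = new StreamWriter("resultant.txt"))
{
    foreach (var textFragment in response.TextOccurrences.List)
    {
        output.WriteLine(textFragment.Text);
    }
}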

Convert PDF to TXT File using cURL

For developers who prefer a scripting or cross-platform workflow, the Aspose.PDF Cloud REST API can also be accessed using cURL commands.

Step 1. – Generate an Access Token:

curl -v "https://api.aspose.cloud/connect/token" \
 -X POST \
 -d "grant_type=client_credentials&client_id=XXXXXXX-XXXXXX-ff5c3a6aa4a2&client_secret=XXXXXXXXXXX" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -H "Accept: application/json"
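The token endpoint replies with a JSON object, and the token itself sits in the access_token field, the standard field name for OAuth 2.0 client-credentials responses. Here is a minimal sketch of pulling it out with sed. The function name extract_token and the sample response are illustrative only, and a JSON-aware tool such as jq is the more robust choice where it is available.

```shell
# Print the value of the "access_token" field from a JSON token response
# read on standard input. Assumes the value contains no escaped quotes,
# which holds for JWTs (base64url characters and dots only).
extract_token() {
    sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Illustrative response body; a real token is much longer.
sample='{"access_token":"abc.def.ghi","expires_in":3600,"token_type":"Bearer"}'
ACCESS_TOKEN=$(printf '%s\n' "$sample" | extract_token)
echo "$ACCESS_TOKEN"
```

Against the live endpoint, the Step 1 command with -s in place of -v can be piped straight into extract_token.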

Step 2. – Extract Text from PDF: Once the JWT token has been generated, run the following command to pull the text from the PDF document.

curl -v "https://api.aspose.cloud/v3.0/pdf/{inputPDF}/text?splitRects=true&LLX=0&LLY=0&URX=800&URY=800" \
-X GET \
-H  "accept: application/json" \
-H  "authorization: Bearer {ACCESS_TOKEN}" \
-o "resultant.txt"

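This cURL command retrieves the text found inside the given region of your PDF file and writes the API response to resultant.txt. Because the request asks for application/json, expect the saved file to hold a JSON payload of text occurrences rather than bare text. The approach works well for PDF to text file conversion in automated environments.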
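Since the request sends accept: application/json, the saved file will most likely contain JSON rather than bare text. The sketch below reduces such a response to one line of plain text per occurrence. It assumes each occurrence carries a "Text" field, mirroring the TextOccurrences.List and Text names in the C# example; that JSON shape is an assumption, so check it against a real response. The function name json_to_text is hypothetical.

```shell
# Print one line per "Text" value found in a JSON response on standard input.
# Assumes text values contain no escaped double quotes; use a JSON-aware
# tool such as jq when that cannot be guaranteed.
json_to_text() {
    grep -o '"Text"[[:space:]]*:[[:space:]]*"[^"]*"' |
        sed 's/^"Text"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Illustrative response body modelled on the C# response object.
sample='{"TextOccurrences":{"List":[{"Text":"Hello","Page":1},{"Text":"World","Page":1}]}}'
PLAIN_TEXT=$(printf '%s\n' "$sample" | json_to_text)
echo "$PLAIN_TEXT"
```

With a saved response, the equivalent call would be json_to_text < resultant.txt > plain.txt, where plain.txt is an example output name.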

Try Free PDF to Text Converter

Are you looking for PDF to TXT conversion without coding? Try our Free Online PDF to Text Converter — powered by Aspose.PDF Cloud. Simply upload your PDF and download the extracted text file in seconds.


Conclusion

In this article, we looked at how to convert PDF to text using both the C# SDK and cURL, a task that is essential for extracting and reusing information efficiently. With Aspose.PDF Cloud, you can automate copying text from PDFs, handle scanned files using OCR online, and export data as structured text for analytics or search indexing.

Frequently Asked Questions (FAQs)

  1. Can I copy text from PDF programmatically? Absolutely. The API allows you to copy text from PDF files by retrieving all text occurrences or extracting from specific regions using coordinates.

  2. What’s the difference between PDF to text and text to PDF? PDF to text extracts textual data from documents, while text to PDF creates a new PDF document from plain text input. Aspose.PDF Cloud supports both operations.

  3. Do I need Adobe Acrobat installed? No. The Aspose.PDF Cloud SDK operates independently of Adobe Acrobat or any other software. All PDF to text conversion operations occur in the cloud.

  4. Is the extracted text accurate for complex layouts? Yes. The API can accurately extract text from multi-column layouts, tables, and mixed content PDFs, maintaining a clean and readable structure in the resulting TXT file.
