
Extract text from PDF using .NET REST API.

PDF documents have become the standard for sharing and exchanging information across platforms and devices. While PDFs offer a reliable and consistent format, extracting the information they contain can be a daunting task, especially when dealing with large volumes of data. Whether you need the text for analysis, data entry, or content processing, efficient and accurate text extraction is essential. In this article, we look at extracting text from PDF files using a .NET REST API, powered by Aspose.PDF Cloud SDK.

REST API for PDF Processing

Aspose.PDF Cloud SDK for .NET is a powerful and easy-to-use API that simplifies text extraction from PDFs. One of its standout features is its ability to handle complex PDF structures and extract text accurately from documents with varied layouts. Whether the PDF contains text, images, tables, or other complex elements, the API can traverse the document and return its textual content. Its feature set, accuracy, and ease of integration make it a good choice for developers who need to extract text from PDF documents in their .NET applications.

To get started, the first step is to add a reference to the Cloud SDK in your .NET solution: search for ‘Aspose.PDF-Cloud’ in the NuGet package manager and click the ‘Add Package’ button. Second, visit the cloud dashboard and obtain your personalized client credentials.

Extract PDF Text using C# .NET

In this section, we walk through the details of extracting text from a PDF document programmatically.

// For complete examples and data files, please go to https://github.com/aspose-pdf-cloud/aspose-pdf-cloud-dotnet
using System;
using Aspose.Pdf.Cloud.Sdk.Api;
using Aspose.Pdf.Cloud.Sdk.Model;

// Get client credentials from https://dashboard.aspose.cloud/
string clientSecret = "4d84d5f6584160cbd91dba1fe145db14";
string clientID = "bb959721-5780-4be6-be35-ff5c3a6aa4a2";

// Create an instance of PdfApi
PdfApi pdfApi = new PdfApi(clientSecret, clientID);

// Name of the input PDF file
String inputFile = "Binder1-1.pdf";
// Read the contents of the PDF file into a stream instance
var sourceFile = System.IO.File.OpenRead(inputFile);

// Upload the PDF file to cloud storage
pdfApi.UploadFile("inputPDF.pdf", sourceFile);

// X-coordinate of the lower-left corner
Double LLX = 500.0;
// Y-coordinate of the lower-left corner
Double LLY = 500.0;
// X-coordinate of the upper-right corner
Double URX = 800.0;
// Y-coordinate of the upper-right corner
Double URY = 800.0;

// Call the API to extract text from the given coordinates of the PDF document
TextRectsResponse response = pdfApi.GetText("inputPDF.pdf", LLX, LLY, URX, URY, null, null, null, null, null);

// Iterate through the individual text occurrences
for (int counter = 0; counter <= response.TextOccurrences.List.Count - 1; counter++)
{
    // write the text content to the console
    Console.WriteLine(response.TextOccurrences.List[counter].Text);
}

Preview of the text extracted from the PDF document.

The code snippet above is explained step by step below.

PdfApi pdfApi = new PdfApi(clientSecret, clientID);

First, create an instance of the PdfApi class, passing the client credentials as arguments.

String inputFile = "Binder1-1.pdf";
var sourceFile = System.IO.File.OpenRead(inputFile);

Load the contents of the input PDF file into a stream instance.

pdfApi.UploadFile("inputPDF.pdf", sourceFile);

Upload the PDF document to cloud storage.

TextRectsResponse response = pdfApi.GetText("inputPDF.pdf", LLX, LLY, URX, URY, null, null, null, null, null);

Call the API to extract text from the PDF file within the specified page coordinates.
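
The four values describe a rectangle by its lower-left and upper-right corners. The article does not state the unit. Assuming the API follows the usual PDF convention of points (1/72 inch) measured from the lower-left corner of the page, a measurement taken in millimetres can be converted before it is passed as LLX, LLY, URX, or URY. The helper name `mm_to_pt` below is hypothetical, and the sketch is written in shell so it can be tried without a .NET project.

```shell
# Hypothetical helper: convert millimetres to PDF points
# (1 pt = 1/72 inch, 1 inch = 25.4 mm). The unit itself is an assumption.
mm_to_pt() {
  awk -v mm="$1" 'BEGIN { printf "%.2f\n", mm * 72 / 25.4 }'
}

mm_to_pt 25.4   # one inch, prints 72.00
mm_to_pt 210    # width of an A4 page, prints 595.28
```

If the unit assumption holds, a rectangle that extends past the page size would simply cover everything up to the page edge on that side.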

for (int counter = 0; counter <= response.TextOccurrences.List.Count - 1; counter++)
{
    // write text content in console
    Console.WriteLine(response.TextOccurrences.List[counter].Text);
}

Iterate over the list of extracted text occurrences and print each one to the console.

Extract Text from PDF using cURL Commands

Using cURL commands together with the Aspose.PDF Cloud API, you can extract text content from PDF files hosted in cloud storage. The API supports a range of parameters for customizing the extraction, letting you specify coordinates and other options to extract text precisely.

The first step in this approach is to generate a JWT access token by executing the following command.

curl -v "https://api.aspose.cloud/connect/token" \
 -X POST \
 -d "grant_type=client_credentials&client_id=bb959721-5780-4be6-be35-ff5c3a6aa4a2&client_secret=4d84d5f6584160cbd91dba1fe145db14" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -H "Accept: application/json"
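
The token endpoint replies with a JSON document. As a minimal sketch, assuming the reply carries the token in the standard OAuth 2.0 `access_token` field, the helpers below build the request body from environment variables and pull the token out of a reply with `sed`. The variable names `ASPOSE_CLIENT_ID` and `ASPOSE_CLIENT_SECRET`, both function names, and the sample reply are illustrative and not part of the Aspose API.

```shell
# Hypothetical environment variables; the fallbacks are placeholders, not real credentials.
: "${ASPOSE_CLIENT_ID:=your-client-id}"
: "${ASPOSE_CLIENT_SECRET:=your-client-secret}"

# Print the form-encoded body for the client-credentials token request.
token_request_body() {
  printf 'grant_type=client_credentials&client_id=%s&client_secret=%s\n' \
    "$ASPOSE_CLIENT_ID" "$ASPOSE_CLIENT_SECRET"
}

# Read a JSON token reply on stdin and print the value of "access_token".
# Assumes the value contains no escaped quotes, which holds for JWTs.
extract_access_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Offline demonstration with a made-up reply (not a real token).
sample_reply='{"access_token":"aaa.bbb.ccc","expires_in":3600,"token_type":"Bearer"}'
printf '%s\n' "$sample_reply" | extract_access_token   # prints aaa.bbb.ccc
```

Against the live service the two helpers could be combined as `ACCESS_TOKEN=$(curl -s "https://api.aspose.cloud/connect/token" -d "$(token_request_body)" | extract_access_token)`, which keeps the credentials out of the command line history.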

Once the JWT token has been generated, execute the following command to extract text from the PDF document.

curl -v "https://api.aspose.cloud/v3.0/pdf/{inputPDF}/text?splitRects=true&LLX=10&LLY=10&URX=800&URY=800" \
-X GET \
-H  "accept: application/json" \
-H  "authorization: Bearer {accessToken}" \
-o "extractedContent.txt"

Replace ‘inputPDF’ with the name of a PDF document already available in cloud storage, and ‘accessToken’ with the JWT access token generated above.
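
Because the rectangle travels as query-string parameters, a swapped pair of corners is easy to miss. The sketch below uses a hypothetical helper, `build_text_url`, to assemble the request URL from the command above and to refuse a rectangle whose lower-left corner is not below and to the left of its upper-right corner. The helper name and the sanity check are illustrative, not rules documented by the API. The file name is inserted as-is, so names with spaces or other reserved characters would still need URL encoding.

```shell
# Hypothetical helper: print the text-extraction URL for a PDF in cloud storage.
# Arguments: file name, LLX, LLY, URX, URY.
build_text_url() {
  name=$1; llx=$2; lly=$3; urx=$4; ury=$5
  # awk compares the values numerically, so decimals such as 10.5 work too.
  if ! awk -v llx="$llx" -v lly="$lly" -v urx="$urx" -v ury="$ury" \
       'BEGIN { exit !(llx < urx && lly < ury) }'; then
    echo "invalid rectangle: expected LLX < URX and LLY < URY" >&2
    return 1
  fi
  printf 'https://api.aspose.cloud/v3.0/pdf/%s/text?splitRects=true&LLX=%s&LLY=%s&URX=%s&URY=%s\n' \
    "$name" "$llx" "$lly" "$urx" "$ury"
}

# Reproduces the URL used in the cURL command for a file named inputPDF.pdf.
build_text_url "inputPDF.pdf" 10 10 800 800
```

With a token held in a shell variable such as `ACCESS_TOKEN`, the result can be passed straight to cURL: `curl -H "authorization: Bearer $ACCESS_TOKEN" "$(build_text_url inputPDF.pdf 10 10 800 800)"`.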

Conclusion

In conclusion, both the Aspose.PDF Cloud SDK for .NET and the cURL command approach offer effective ways to extract text from PDF documents. The Aspose.PDF Cloud SDK for .NET provides a comprehensive, developer-friendly API, making it a strong choice for integrating PDF text extraction into .NET applications. The cURL approach, on the other hand, offers a simple, platform-independent way to interact with the Aspose.PDF Cloud API, which suits developers working in different environments and programming languages.

Useful Links

Related Articles

We recommend visiting the following blogs: