Extract Text from PDF in Java

We all know that PDF is one of the most important and widely used digital formats for presenting and exchanging documents reliably, independent of software, hardware, or operating system. In some cases, however, we may only be interested in a portion of a large PDF file, or we may need to save a PDF as text online. In this article, we look in detail at how to develop a PDF to text converter using the Java REST API.

PDF Generator API

Use our REST API to generate PDF documents from templates or from scratch. The API also lets you edit PDF files and convert them to other supported formats. You can likewise extract text from PDF files, and split and merge them, using the Java Cloud SDK. To use Aspose.PDF Cloud SDK for Java, we need to add a reference to it in our Java application by including the following details in pom.xml (Maven build type project).

<repositories> 
    <repository>
        <id>aspose-cloud</id>
        <name>artifact.aspose-cloud-releases</name>
        <url>http://artifact.aspose.cloud/repo</url>
    </repository>   
</repositories>

<dependencies>
    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-cloud-pdf</artifactId>
        <version>21.11.0</version>
        <scope>compile</scope>
    </dependency>
</dependencies>

After installation, we need to create a free account on the Cloud Dashboard and obtain our personalized client credentials.
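Rather than hard-coding these credentials in source files, one option is to read them from the environment. The following is only a minimal sketch: the variable names ASPOSE_CLIENT_ID and ASPOSE_CLIENT_SECRET and the AsposeCredentials class are assumptions made for illustration and are not defined by the SDK.

```java
import java.util.Map;

/** Hypothetical helper holding the client ID and secret from the Cloud Dashboard. */
final class AsposeCredentials {
    final String clientId;
    final String clientSecret;

    private AsposeCredentials(String clientId, String clientSecret) {
        this.clientId = clientId;
        this.clientSecret = clientSecret;
    }

    /** Looks up both values in the given map, for example System.getenv(). */
    static AsposeCredentials from(Map<String, String> env) {
        return new AsposeCredentials(
                require(env, "ASPOSE_CLIENT_ID"),       // assumed variable name
                require(env, "ASPOSE_CLIENT_SECRET"));  // assumed variable name
    }

    /** Returns the trimmed value, or fails early if it is missing or blank. */
    private static String require(Map<String, String> env, String key) {
        String value = env.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        try {
            AsposeCredentials credentials = AsposeCredentials.from(System.getenv());
            System.out.println("Loaded credentials for client " + credentials.clientId);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The two fields could then be passed to the PdfApi constructor in place of string literals.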

Extract Text from PDF using Java

Let's look at the details of extracting text from a PDF using the Cloud SDK for Java. In this example, we use the following input file, PdfWithTable.pdf.

Image 1: Input file for PDF to text extraction.

Image 2: Preview of the text extracted from the PDF.

// For more examples, please visit https://github.com/aspose-pdf-cloud/aspose-pdf-cloud-java/tree/master/Examples/src/main/java/com/aspose/asposecloudpdf/examples

// Imports needed at the top of the enclosing class file
import java.io.File;
import com.aspose.asposecloudpdf.api.PdfApi;
import com.aspose.asposecloudpdf.model.TextRectsResponse;

try {
    // Get the ClientID and ClientSecret from https://dashboard.aspose.cloud/
    String clientId = "bb959721-5780-4be6-be35-ff5c3a6aa4a2";
    String clientSecret = "4d84d5f6584160cbd91dba1fe145db14";

    // create an instance of PdfApi
    PdfApi pdfApi = new PdfApi(clientSecret,clientId);
    // name of the input PDF document
    String name = "PdfWithTable.pdf";

    // read the content of the input PDF file
    File file = new File(name);
    // upload the PDF to cloud storage under the name input.pdf
    pdfApi.uploadFile("input.pdf", file, null);

    // X coordinate of the lower-left corner
    Double LLX = 500.0;
    // Y coordinate of the lower-left corner
    Double LLY = 500.0;
    // X coordinate of the upper-right corner
    Double URX = 800.0;
    // Y coordinate of the upper-right corner
    Double URY = 800.0;

    // call the API to convert PDF to text
    TextRectsResponse response = pdfApi.getText("input.pdf", LLX, LLY, URX, URY, null, null, null, null, null);

    // traverse the individual text occurrences
    for (int counter = 0; counter < response.getTextOccurrences().getList().size(); counter++) {
        // write the text content to the console
        System.out.println(response.getTextOccurrences().getList().get(counter).getText());
    }

    System.out.println("Text extracted from PDF successfully!");
} catch (Exception ex) {
    System.out.println(ex);
}

Now let's try to understand the code snippet above:

PdfApi pdfApi = new PdfApi(clientSecret,clientId);

Create an instance of PdfApi, passing the personalized credentials as arguments.

File file = new File(name); 
pdfApi.uploadFile("input.pdf", file, null);

Read the input PDF using a File object and upload it to cloud storage using the uploadFile(…) method of the PdfApi class. Note that the file is stored under the name passed to the uploadFile method.

TextRectsResponse response = pdfApi.getText("input.pdf", LLX, LLY, URX, URY, null, null, null, null, null);    

Now call the getText(..) method, specifying the name of the input PDF file and the rectangular region of the page from which the text should be extracted. The extracted content is returned in a TextRectsResponse object.
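To make the four coordinates easier to reason about, here is a small sketch of a helper that builds and validates such a rectangle. It assumes the usual PDF coordinate system (origin at the lower-left corner of the page, units in points of 1/72 inch) and an A4 page of roughly 595 x 842 points. The TextRegion class and its methods are hypothetical and not part of the SDK.

```java
/** Hypothetical value type for the region passed to getText as LLX, LLY, URX, URY. */
final class TextRegion {
    final double llx;
    final double lly;
    final double urx;
    final double ury;

    TextRegion(double llx, double lly, double urx, double ury) {
        // The lower-left corner must really be below and left of the upper-right one.
        if (llx >= urx || lly >= ury) {
            throw new IllegalArgumentException(
                    "Expected llx < urx and lly < ury, got ("
                            + llx + ", " + lly + ") and (" + urx + ", " + ury + ")");
        }
        this.llx = llx;
        this.lly = lly;
        this.urx = urx;
        this.ury = ury;
    }

    /** Region covering a whole page of the given size. */
    static TextRegion fullPage(double pageWidth, double pageHeight) {
        return new TextRegion(0, 0, pageWidth, pageHeight);
    }

    /** Region covering only the top half of the page (y grows upwards). */
    static TextRegion topHalf(double pageWidth, double pageHeight) {
        return new TextRegion(0, pageHeight / 2, pageWidth, pageHeight);
    }

    public static void main(String[] args) {
        // Approximate A4 page size in points (an assumption about the input file).
        TextRegion region = TextRegion.topHalf(595, 842);
        System.out.println(region.llx + ", " + region.lly + ", " + region.urx + ", " + region.ury);
    }
}
```

The four fields would then be passed to getText in the order LLX, LLY, URX, URY. A region that lies outside the actual page is likely to yield no text, so it is worth checking the page size of the input document first.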

response.getTextOccurrences().getList().get(counter).getText()

Finally, to print the extracted text, we iterate through all the TextOccurrences and display them in the console.
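Since the goal is often to save the result as a text file rather than print it, the following sketch writes the extracted fragments to disk. It assumes the strings returned by getText() have already been collected into a List<String>. The TextSaver class and the output file name are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that stores extracted text fragments in a UTF-8 text file. */
final class TextSaver {

    /** Writes each fragment on its own line and returns the path that was written. */
    static Path save(List<String> fragments, Path target) throws IOException {
        return Files.write(target, fragments, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; in the article's example these strings would come from
        // response.getTextOccurrences().getList().get(i).getText().
        List<String> fragments = Arrays.asList("First fragment", "Second fragment");
        Path written = save(fragments, Paths.get("extracted.txt"));
        System.out.println("Saved " + fragments.size() + " fragments to " + written);
    }
}
```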

PDF to Text using cURL Commands

Besides the Java code snippet, we can also perform the pdftotext operation using cURL commands. One prerequisite for this approach is to generate a JWT access token (based on the client credentials) using the following command.

curl -v "https://api.aspose.cloud/connect/token" \
-X POST \
-d "grant_type=client_credentials&client_id=bb959721-5780-4be6-be35-ff5c3a6aa4a2&client_secret=4d84d5f6584160cbd91dba1fe145db14" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"
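For readers who prefer to stay in Java, the same token request can be sketched with the JDK's built-in java.net.http API (Java 11 or later). The TokenRequest class is hypothetical, and the sketch only builds the request; sending it and parsing the JSON response are left out.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper mirroring the cURL token request above. */
final class TokenRequest {
    static final String TOKEN_URL = "https://api.aspose.cloud/connect/token";

    /** Builds the form-encoded body, URL-encoding the two credential values. */
    static String formBody(String clientId, String clientSecret) {
        return "grant_type=client_credentials"
                + "&client_id=" + URLEncoder.encode(clientId, StandardCharsets.UTF_8)
                + "&client_secret=" + URLEncoder.encode(clientSecret, StandardCharsets.UTF_8);
    }

    /** Builds the POST request with the same headers as the cURL command. */
    static HttpRequest build(String clientId, String clientSecret) {
        return HttpRequest.newBuilder(URI.create(TOKEN_URL))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(clientId, clientSecret)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("<client id>", "<client secret>");
        System.out.println(request.method() + " " + request.uri());
        // To send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```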

Once the JWT has been generated, execute the following command to extract text from a PDF file that is already available in cloud storage.

curl -v -X GET "https://api.aspose.cloud/v3.0/pdf/input.pdf/text?splitRects=true&LLX=0&LLY=0&URX=800&URY=800" \
-H  "accept: application/json" \
-H  "authorization: Bearer <JWT Token>"
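The extraction call can be sketched the same way with java.net.http (Java 11 or later). The TextRequest class is hypothetical, the query parameters are copied from the cURL command above, and the sketch assumes a simple file name with no characters that need URL escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper mirroring the cURL text extraction request above. */
final class TextRequest {
    static final String BASE_URL = "https://api.aspose.cloud/v3.0/pdf/";

    /** Builds the request URI for the given file and rectangle. */
    static URI textUri(String fileName, int llx, int lly, int urx, int ury) {
        return URI.create(BASE_URL + fileName + "/text"
                + "?splitRects=true"
                + "&LLX=" + llx + "&LLY=" + lly
                + "&URX=" + urx + "&URY=" + ury);
    }

    /** Builds the GET request, authorising it with the JWT obtained earlier. */
    static HttpRequest build(String fileName, String jwt, int llx, int lly, int urx, int ury) {
        return HttpRequest.newBuilder(textUri(fileName, llx, lly, urx, ury))
                .header("Accept", "application/json")
                .header("Authorization", "Bearer " + jwt)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("input.pdf", "<JWT Token>", 0, 0, 800, 800);
        System.out.println(request.method() + " " + request.uri());
        // To send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```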

Quick Tip

Looking for a free PDF to Text app? Please try our [PDF Parser](https://products.aspose.app/pdf/parser).

Conclusion

In conclusion, extracting text from PDF files using Java can be a powerful solution for those looking to automate their data processing and analysis needs. With the help of this guide, you now have a solid foundation to build on and can easily implement a Java-based solution for extracting text from PDF documents. Whether you want to extract text for data analysis, machine learning, or any other purpose, Java offers a flexible and reliable platform for your needs. So go ahead and put your newly acquired skills to the test!

If you are interested in exploring other features offered by the API, please see the Product Documentation. Finally, if you encounter any issue while using the API, or have a related question, please feel free to contact us via the free Product Support Forum.
