[Bug]: A memory leak occurred when processing some special PDF files. #20046

Open
@zhou7510

Description

Attach (recommended) or Link to PDF file

test-2.pdf

Web browser and its version

Node.js 20.17.0

Operating system and its version

Windows 11

PDF.js version

5.3.31

Is the bug present in the latest PDF.js version?

Yes

Is a browser extension

No

Steps to reproduce the problem

import fs from 'node:fs/promises';
import path from 'node:path';
import { fileURLToPath } from 'node:url';
import { getDocument } from 'pdfjs-dist/legacy/build/pdf.mjs';

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Locations of the CMap and standard font data shipped with pdfjs-dist.
const CMAP_URL = new URL('./node_modules/pdfjs-dist/cmaps/', import.meta.url).href;
const STD_FONT_URL = new URL('./node_modules/pdfjs-dist/standard_fonts/', import.meta.url).href;

(async () => {
  // Read the attached PDF into memory.
  const pdfPath = path.resolve(__dirname, 'test-2.pdf');
  const data = new Uint8Array(await fs.readFile(pdfPath));

  const loadingTask = getDocument({
    data,
    cMapUrl: CMAP_URL,
    standardFontDataUrl: STD_FONT_URL,
    disableWorker: true,
  });
  const pdfDoc = await loadingTask.promise;
  console.log(`PDF loaded, total pages: ${pdfDoc.numPages}`);

  // Request the operator list for the first page.
  const page = await pdfDoc.getPage(1);
  const opList = await page.getOperatorList();
  console.log('Operator list length =', opList.fnArray.length);
})();

What is the expected behavior?

The attached PDF is a small, blank file, so PDF.js should be able to process it quickly and return.

What went wrong?

I encountered a memory leak when running the code above.
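
To make the memory growth easier to quantify, here is a minimal sketch of a guard that samples the process memory while an async step runs. The helper name runWithMemoryGuard and its options (maxGrowthBytes, timeoutMs, sampleMs) are hypothetical and not part of PDF.js; only process.memoryUsage() from Node.js is assumed. Note that the sampler can only fire if the task yields to the event loop; if the processing blocks synchronously, an external monitor would be needed instead.

```javascript
// Hypothetical helper (not part of PDF.js): runs an async task while sampling
// resident memory (RSS), and rejects if RSS grows past a limit or the task
// takes longer than the timeout.
function runWithMemoryGuard(
  task,
  { maxGrowthBytes = 512 * 1024 * 1024, timeoutMs = 30000, sampleMs = 250 } = {}
) {
  const startRss = process.memoryUsage().rss;
  const startTime = Date.now();
  let peakGrowthBytes = 0;

  return new Promise((resolve, reject) => {
    // Stop sampling, then settle the outer promise exactly once.
    function finish(settle, value) {
      clearInterval(timer);
      settle(value);
    }

    const timer = setInterval(() => {
      const growth = process.memoryUsage().rss - startRss;
      peakGrowthBytes = Math.max(peakGrowthBytes, growth);
      if (growth > maxGrowthBytes) {
        finish(reject, new Error(`RSS grew by ${growth} bytes, over the limit`));
      } else if (Date.now() - startTime > timeoutMs) {
        finish(reject, new Error(`Task timed out after ${timeoutMs} ms`));
      }
    }, sampleMs);

    // Start the task on a later microtask so the timer is already defined.
    Promise.resolve()
      .then(task)
      .then(
        (result) => finish(resolve, { result, peakGrowthBytes }),
        (err) => finish(reject, err)
      );
  });
}
```

In the reproduction script, the suspect call could be wrapped as runWithMemoryGuard(() => page.getOperatorList()), so that a run on the attached file fails with a clear error rather than consuming memory until the process is killed.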

Link to a viewer

No response

Additional context

No response
