Rookie problem: spiralling memory usage on inference

Hi! I’m transitioning from TensorFlow to PyTorch in my Android application, and I’m running into a problem when I run inference: memory usage increases quickly until the app crashes. Android Studio reports that all of the memory usage is Native. I’m clearly abusing the library somewhere, or not releasing something that needs releasing. I’ve looked through most of the sample apps and can’t find any clues.

I’m processing 15 frames per second from a screen recording.

My code looks like this. A “slice” is just a data object holding the bitmaps I’m using (cropped, resized, original capture). I know those are being freed because when I remove the inference step but leave the rest of the image pipeline in place, memory usage is fine.

class EvaluationFilter: Filter {
    // Lazy initialization so we can ensure the context is set up in the service.
    private val module: Module? by lazy {
        val path = assetFilePath("image_qual3.ptl")
        // ...
    }

    // Perform inference on a screen capture. Called about 15 times per second
    // while the user is recording.
    override fun processSlice(slice: Slice): Slice {
        slice.croppedScaledBitmap?.let { bitmap ->
            val inputTensor = TensorImageUtils.bitmapToFloat32Tensor(bitmap, /* ... */)
            val output = module?.forward(IValue.from(inputTensor))?.toTensor()
            val scores = output?.dataAsFloatArray
            scores?.let { slice.quality = it[1] }
            Log.d(TAG, "Scores are ${scores?.contentToString()}")
        }
        return slice
    }
}

Thanks for any help, I’m at a bit of a loss. I’m sure it will be a forehead-slapper when it’s revealed.

I am not very familiar with Android app development. Can you elaborate on what you mean here?

If the issue is a memory leak in the PyTorch runtime, then we may have to figure out where it is coming from. Is this a quantized model or a floating-point model? Do you know?
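One way to narrow down where the growth comes from might be to run the same inference call in a loop on a fixed input and watch a memory counter between calls. Below is a minimal sketch in plain Kotlin. `measureGrowth` and `looksLikeLeak` are hypothetical helpers written for this thread, not part of PyTorch or Android, and the probe and step are passed in as lambdas so the sketch does not depend on either library. On Android, the probe could plausibly be `{ Debug.getNativeHeapAllocatedSize() }` and the step a single `module.forward(...)` call, but that pairing is an assumption rather than a confirmed diagnosis method.

```kotlin
// Hypothetical helper (not a PyTorch or Android API): samples a memory probe
// after each call to `step` and returns the growth per iteration, in bytes.
fun measureGrowth(
    iterations: Int,
    probe: () -> Long,  // assumed on Android: { Debug.getNativeHeapAllocatedSize() }
    step: () -> Unit    // assumed: one inference call on a fixed input
): List<Long> {
    require(iterations > 0) { "iterations must be positive" }
    val deltas = ArrayList<Long>(iterations)
    var before = probe()
    repeat(iterations) {
        step()
        val after = probe()
        deltas.add(after - before)
        before = after
    }
    return deltas
}

// Hypothetical heuristic: after skipping `warmUp` iterations, report a likely
// leak only if every remaining iteration grew by more than `thresholdBytes`.
fun looksLikeLeak(deltas: List<Long>, warmUp: Int, thresholdBytes: Long): Boolean {
    val steady = deltas.drop(warmUp)
    return steady.isNotEmpty() && steady.all { it > thresholdBytes }
}
```

If the per-iteration deltas stay positive with the bitmap conversion removed from the step and only the forward call left in, that would point at the runtime or the model rather than the image pipeline. The warm-up and threshold values are guesses to tune, since the first few calls usually allocate buffers that are later reused.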