
V2 - Copy failed: Error: Unknown system error (2.00.01) #318

Closed
wsegatto opened this issue Jan 22, 2021 · 35 comments

@wsegatto

wsegatto commented Jan 22, 2021

Describe the bug
Copy operation from the transcode folder to the output folder fails constantly.

Copy failed: Error: Unknown system error -139412464: Unknown system error -139412464, copyfile '/home/pi/tmp/Transcode/DARK-TdarrCacheFile-7CKwlGW4b.mkv' -> '/home/pi/tmp/Output/DARK-TdarrCacheFile-meuGljGGn.mkv'

To Reproduce
In my case, I just have to convert any file.

Expected behavior
Files should be moved to the Output folder. If the operation fails (e.g. drive not mounted), a retry/skip button should be available. Already-transcoded files in the queue awaiting copy should be saved, so that if I restart the server the copy continues rather than the entire transcode starting again. Logs should be clear about the error.

Screenshots
[screenshot attached]

Desktop:

  • OS: Raspberry Pi OS ARMv7 32-bit, latest

Additional context
I'm running Tdarr 1.99.09.

@HaveAGitGat
Owner

The staging section has been saved on server restart since .10, and a retry copy button was added in .11. Worth trying .14.

@wsegatto
Author

wsegatto commented Jan 23, 2021

Thank you for the info! I'll give it a try once I'm able to open .14; I have opened #319 for this.

@wsegatto
Author

Just had my first successful transcode with V2: the file was copied to the output folder and the transcode folder was cleared in the process.
Seems like the issue is solved on Debian Buster (ARMv7)!

Will try now with a CIFS mount in my Samba share.

@wsegatto
Author

OK, so the issue continues, but I have an insight here.
I monitored the remote folder while Tdarr was copying to the Output folder. The file was fully copied but then got deleted, and Tdarr said the copy failed. There was no info in the logs, only:

Error: Unknown system error -138642310: Unknown system error -138642310, copyfile in Tdarr.

Could you please check what exactly runs after the copy? Maybe the renaming is failing. If you provide me the code, I can run it directly in the console to see what error I get.

@jasonsansone

I am having a similar but slightly different error. I didn't open a new issue in case it is a duplicate.

Encoding and copying works on the master container, which has all modules (Server, WebUI, Node). My slave containers are all receiving errors when attempting to copy from their local transcode cache location back to the storage directory after a successful encode. I am using LXC containers on Proxmox with CephFS for the cluster storage backend, so paths are all the same. The transcode cache is always /opt/scratch, which is local to the LXC container and backed by an NVMe storage cluster. This all worked fine on v1, but that was before remote nodes existed in v2. I can manually copy the file from the scratch directory to the storage cluster. I have checked permissions on all containers. Proxmox nodes are linked by 2x 10GbE fiber in LACP, so the network is not a bottleneck.

Is the transcode cache supposed to be on a network-accessible location and not local? The new instructions hint that is the case. If so, what is the point of having a "cache" directory? The network share will be the same speed and open to the same issues for both the cache and the final destination. Shouldn't the flow be to write to a local drive while reading from the network share, verify the transcode succeeded, move the local cached transcode to the remote share, delete the original file, and rename the transcode to the original filename? That appears to be exactly what happens (and what I want to happen) when using the master node with the Server module. It only fails on "remotes".

Error: ENOENT: no such file or directory, stat '/opt/scratch/redactedfilename-TdarrCacheFile-xrB4X3mdq.mkv'

Screen Shot 2021-01-25 at 8 34 17 PM

@wsegatto
Author

I had that issue with a Samba share and it turned out to be a mount issue. Are the files in the remote directory showing as owned by root when you ls -la in the folder? I solved it by adding these options when mounting the directory: uid=1000,gid=1000,dir_mode=0777,file_mode=0777.
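
For reference, my mount ends up looking roughly like this (the share name, mount point and username are placeholders; the options are the part that mattered):

sudo mount -t cifs //nas/media /home/pi/tmp/Output \
  -o username=pi,uid=1000,gid=1000,dir_mode=0777,file_mode=0777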

By cache you mean the transcode cache? Well, if multiple passes have to be made on the file, they happen in this temp area, and only at the end are the files copied to the Output folder and removed from the cache. The transcode folder should be local to the node as far as I know.

@wsegatto
Author

wsegatto commented Jan 26, 2021

OK, so the issue continues, but I have an insight here.
I monitored the remote folder while Tdarr was copying to the Output folder. The file was fully copied but then got deleted, and Tdarr said the copy failed. There was no info in the logs, only:

Error: Unknown system error -138642310: Unknown system error -138642310, copyfile in Tdarr.

Could you please check what exactly runs after the copy? Maybe the renaming is failing. If you provide me the code, I can run it directly in the console to see what error I get.

@HaveAGitGat, when you get the chance.

@wsegatto
Author

wsegatto commented Jan 27, 2021

This is what's happening in my output mount (file size is 2.7GB):

Tdarr.Copy.mp4

@wsegatto
Author

It seems the issue only happens for files larger than 2GB; it's not related to NFS or CIFS.
This was supposed to be fixed in Node.js v10, but the issue occurs with v12 as well.

nodejs/node#30085

@wsegatto
Author

This seems related: it seems copyFileSync throws errors if the file size in bytes exceeds the maximum integer value, overflowing it and turning negative.
https://gitmemory.com/issue/nodejs/node/30085/545466474

const fs = require("fs");

fs.copyFileSync("./2GB.bin", "./2GB-copySync.bin");

fs.copyFile(
  "./2GB.bin",
  "./2GB-copyASync.bin",
  console.log
);

fs.promises
  .copyFile("./2GB.bin", "./2GB-copyPromise.bin")
  .then(console.log, console.error);
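
To illustrate the overflow idea (just a sketch; only the 2^31 boundary is exact, and the 2.7GB figure is roughly my file's size):

// A byte count above 2^31-1 no longer fits in a signed 32-bit integer,
// and forcing it into one wraps it negative, like the error numbers above.
const INT32_MAX = 2 ** 31 - 1;                 // 2147483647
const fileSize = Math.round(2.7 * 1024 ** 3);  // ~2.7 GiB
console.log(fileSize > INT32_MAX);             // true
console.log(fileSize | 0);                     // negative after the 32-bit wrap-around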

@supersnellehenk

Just had a brief conversation on Discord with @wsegatto; as noted in his OS details, he's running a 32-bit system. These systems don't really support stuff over 2 GB in size, so there will probably need to be a significant rewrite (I don't know the code, so I can't really tell) to make it work on 32-bit systems. I think something like reading the first 2 GB of the file, then reading each subsequent 2 GB at an offset (see the sketch below).

The Node issue can be ignored, as this isn't a Node issue but rather a max-integer issue due to 32-bit.
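
As a rough sketch of that idea (made-up paths, not Tdarr's actual code), streaming the copy in chunks means no single offset ever has to fit in a 32-bit integer:

const fs = require('fs');
const { pipeline } = require('stream');

// Hypothetical paths for illustration
const src = '/home/pi/tmp/Transcode/input.mkv';
const dest = '/home/pi/tmp/Output/output.mkv';

// pipeline handles backpressure and forwards errors from either stream
pipeline(
  fs.createReadStream(src),
  fs.createWriteStream(dest),
  (err) => {
    if (err) console.error('copy failed:', err);
    else console.log('copy complete');
  }
);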

@jasonsansone

The problem still exists that the functionality for moving files appears to be handled by the server and not the node, thus a node can not utilize a transcode cache location that isn’t accessible to the server.

@supersnellehenk

The problem still exists that the functionality for moving files appears to be handled by the server and not the node, thus a node can not utilize a transcode cache location that isn’t accessible to the server.

That's not related to this issue, but it's by design. It's easier on the network, plus other nodes can access the same files, so three independent nodes can finish a single file: for example, node 1 runs plugin 1, node 1 picks up a different file to run a plugin on, node 2 runs plugin 2, and so on. It's easier on the network because it doesn't hammer the full network bandwidth once all the plugins are done, but gradually streams the file back into the cache at the same rate as the transcode.

@jasonsansone

jasonsansone commented Jan 28, 2021

Haha, your network maybe. I have 20GbE. But understood.

edit: even for a 1GbE LAN, that's still 125 MB/s, which exceeds most consumer hard drives' mixed R/W speeds. I'm not sure the network would be that much of a bottleneck, if any; storage usually is the constraint. 10GbE is obviously 1250 MB/s, which is going to exceed anything short of flash-based storage. Exceeding 1.2 GB/s on spinning rust is an impressive metric.
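
Quick math behind those figures (decimal units, ignoring protocol overhead):

// Convert link speed in Gbit/s to MB/s
const toMBps = (gbitPerSec) => (gbitPerSec * 1000) / 8;
console.log(toMBps(1));  // 125 MB/s on 1GbE
console.log(toMBps(10)); // 1250 MB/s on 10GbE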

@wsegatto
Author

wsegatto commented Jan 28, 2021

Thanks for your help, guys. My point here is that the file IS copied but then gets deleted. The issue doesn't seem to be the physical copy itself, but rather the method of copying.

In any case, I'll try with my other RPi, which is 64-bit, to see how it behaves.

EDIT: Yeah, I don't think I'll be able to try 64-bit now. The updater never worked on my Debian Buster:

root@raspberrypi:/opt/Tdarr# ./Tdarr_Updater 
./Tdarr_Updater: error while loading shared libraries: libatomic.so.1: cannot open shared object file: No such file or directory

root@raspberrypi:/opt/Tdarr# apt install libatomic1
Reading package lists... Done
Building dependency tree       
Reading state information... Done
libatomic1 is already the newest version (8.3.0-6).
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.

root@raspberrypi:/opt/Tdarr# node -v
v15.7.0

root@raspberrypi:/opt/Tdarr# npm -v
7.4.3

@jasonsansone

I don't have anything ARM-based, but I can test anything else that is x86_64. It's easy to spin up a VM of Debian or anything else *nix.

@wsegatto
Author

I have manually pointed node to the libatomic library, but then I get another error. :(

pi@raspberrypi:/opt/Tdarr $ LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu

pi@raspberrypi:/opt/Tdarr $ export LD_LIBRARY_PATH

pi@raspberrypi:/opt/Tdarr $ ./Tdarr_Updater 
./Tdarr_Updater: error while loading shared libraries: libatomic.so.1: wrong ELF class: ELFCLASS64

Why is this hard? :(

@HaveAGitGat
Owner

Tdarr uses fs-extra for this:

const fs = require('fs-extra')

const filePath = 'C:/test'
const newFilePath = 'C:/test_new'

fs.copy(filePath, newFilePath)
  .then(() => console.log('success!'))
  .catch(err => console.error(err))

You could try running the above script with a 2GB+ test file. It seems no error is being given, but the new file is being automatically deleted by fs-extra when it fails. Tdarr then sees it as successful and deletes the cache file.
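
If you don't have a 2GB+ file handy, something like this should create one (hypothetical path; the file is sparse, but only the size matters for this test):

const fs = require('fs');

// Create an empty ~2.5 GiB file by extending it to the target size
const testPath = './2.5GB-test.bin';
const fd = fs.openSync(testPath, 'w');
fs.ftruncateSync(fd, Math.floor(2.5 * 1024 ** 3));
fs.closeSync(fd);
console.log('created', testPath, fs.statSync(testPath).size, 'bytes');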

@wsegatto
Author

wsegatto commented Jan 28, 2021

@HaveAGitGat I have executed the script you mentioned and got a success message. The file was 2.7GB and was completely copied to the output folder with no errors at all.

> fs.copy(filePath, newFilePath).then(() => console.log('success!')).catch(err => console.error(err))
Promise { <pending> }
> success!

It seems no error is being given, but the new file is being automatically deleted by fs-extra when it fails. Tdarr then sees it as successful and deletes the cache file.

That is not what happens. Tdarr is not seeing it as successful and is not deleting the cache file. It throws the error I'm reporting in this issue: the file stays in the transcode folder and the output file is deleted.

I'm trying with Tdarr again. FYI: I'm running 2.00.00 for all 3 modules since Server 2.00.01 is not yet available for ARM (I believe the packages are inconsistent, since 2.00.01 is available for Node and WebUI).

@wsegatto wsegatto changed the title V2 Preview - Copy failed: Error: Unknown system error V2 - Copy failed: Error: Unknown system error (2.00.00) Jan 28, 2021
@wsegatto
Author

Nope, copy failed. The script works with no issues, @HaveAGitGat, but inside Tdarr the copy fails with errors consistently for files over 2GB.

[screenshot attached]

@jasonsansone

Thanks for your help, guys. My point here is that the file IS copied but then gets deleted. The issue doesn't seem to be the physical copy itself, but rather the method of copying.

In any case, I'll try with my other RPi, which is 64-bit, to see how it behaves.

EDIT: Yeah, I don't think I'll be able to try 64-bit now. The updater never worked on my Debian Buster:

root@raspberrypi:/opt/Tdarr# ./Tdarr_Updater 
./Tdarr_Updater: error while loading shared libraries: libatomic.so.1: cannot open shared object file: No such file or directory

root@raspberrypi:/opt/Tdarr# apt install libatomic1
Reading package lists... Done
Building dependency tree       
Reading state information... Done
libatomic1 is already the newest version (8.3.0-6).
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.

root@raspberrypi:/opt/Tdarr# node -v
v15.7.0

root@raspberrypi:/opt/Tdarr# npm -v
7.4.3

Tested 2.00.01 on Debian Buster. Everything works fine. However, that is amd64 arch, not ARM. Whatever the issues are, they are confined to ARM arch.

@wsegatto
Author

Thanks for the test. 2.00.01 is still not fully available on ARM; the packages are inconsistent (the latest Server version is still 2.00.00).

[2021-01-28T10:56:19.047] [INFO] Tdarr_Updater - Tdarr_Node | Current version: 2.00.01 | Required version: 2.00.01
[2021-01-28T10:56:19.056] [INFO] Tdarr_Updater - Tdarr_Node | Up to date! Version: 2.00.01!
[2021-01-28T10:56:19.058] [INFO] Tdarr_Updater - 
[2021-01-28T10:56:19.059] [INFO] Tdarr_Updater - 
[2021-01-28T10:56:19.061] [INFO] Tdarr_Updater - Tdarr_Server | Current version: 2.00.00 | Required version: 2.00.01
[2021-01-28T10:56:19.062] [WARN] Tdarr_Updater - Tdarr_Server | Not the required version! Updating tDownloading 0.00 KB/s:>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>(100%)
[2021-01-28T10:56:21.637] [ERROR] Tdarr_Updater - Tdarr_Server | No module for version 2.00.01 on linux_arm
[2021-01-28T10:56:21.638] [ERROR] Tdarr_Updater - Tdarr_Server | Error downloading!
[2021-01-28T10:56:21.639] [INFO] Tdarr_Updater - 
[2021-01-28T10:56:21.640] [INFO] Tdarr_Updater - 
[2021-01-28T10:56:21.641] [INFO] Tdarr_Updater - Tdarr_WebUI | Current version: 2.00.01 | Required version: 2.00.01
[2021-01-28T10:56:21.642] [INFO] Tdarr_Updater - Tdarr_WebUI | Up to date! Version: 2.00.01!
[2021-01-28T10:56:21.644] [INFO] Tdarr_Updater - 
[2021-01-28T10:56:21.644] [INFO] Tdarr_Updater - Finished!

@HaveAGitGat
Owner

@wsegatto sorry about that, it should now be up.

@wsegatto
Author

I don't know how Node works - whether it uses my local libraries or the libraries bundled at compile time - but I can say for sure that all copy methods seem to work when executed directly with node.

> fs.copy(filePath, newFilePath).then(() => console.log('success!')).catch(err => console.error(err))
Promise { <pending> }
> success!

> try {
...   fs.copySync(filePath, newFilePath)
...   console.log('success!')
... } catch (err) {
...   console.error(err)
... }
success!
 
> try {
...   fs.copyFileSync(filePath, newFilePath)
...   console.log('success!')
... } catch (err) {
...   console.error(err)
... }

success!

@wsegatto
Author

wsegatto commented Jan 28, 2021

The issue continues in v2.00.01. I guess Tdarr is not for ARM; I'm about to give up. The updater doesn't even start on ARMv8 64-bit with Debian Buster, and on 32-bit there are copy issues.

I even tried downloading the latest fs-extra and dropping it into the node_modules folder, but nothing changed.
I can only imagine there is a problem within Tdarr.

Is there a way to debug this? With no logs it's just a wild guess here.

@jasonsansone

I realized I can emulate ARM on Proxmox. What build / OS are you using? I can try to make a test VM. I am happy to spin up whatever we need to help debug.

@wsegatto
Author

I realized I can emulate ARM on Proxmox. What build / OS are you using? I can try to make a test VM. I am happy to spin up whatever we need to help debug.

Thank you so much.
I'm running Raspberry Pi OS 32-bit, latest version, which is Debian Buster (armv7/armhf).

@wsegatto
Author

The test is simple. Just set up a library with source/transcode/output folders, then use a plugin like Tdarr_Plugin_a9hd_FFMPEG_Transcode_Specific_Audio_Stream_Codecs or
Tdarr_Plugin_MC93_Migz4CleanSubs. Use an input file larger than 2GB with some margin to spare, like 2.5GB.
The copy from the transcode folder to the output folder will fail. It doesn't have to be a remote folder; local folders also have this problem.

@wsegatto wsegatto changed the title V2 - Copy failed: Error: Unknown system error (2.00.00) V2 - Copy failed: Error: Unknown system error (2.00.01) Jan 29, 2021
@HaveAGitGat
Owner

I’ll have another look at this today, thanks

@HaveAGitGat
Owner

This should now be fixed in 2.00.02, just released. Working fine with a 4GB file on my Pi.

@wsegatto
Author

Hi everyone!

Thank you so much for the help. I can confirm the issue no longer occurs in 2.00.02.
@HaveAGitGat, for clarity, would you mind sharing what you changed?

[screenshot attached]

Thanks! You just got a new Patreon.

@HaveAGitGat
Owner

@wsegatto I didn't do too much investigating, but it's an issue with fs-extra when it's packaged up. I tried some other file copy libraries, and cp-file seems to do the job.
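
For reference, the cp-file usage is roughly this (a sketch based on the CommonJS releases of cp-file around this time, not the exact Tdarr code; paths are placeholders):

const cpFile = require('cp-file');

// Promise-based copy of a single file
cpFile('/home/pi/tmp/Transcode/input.mkv', '/home/pi/tmp/Output/output.mkv')
  .then(() => console.log('copy complete'))
  .catch((err) => console.error('copy failed:', err));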

@HaveAGitGat
Owner

Glad it finally works :)

@wsegatto
Author

Thank you so much! May this also serve as a pointer for the many other projects facing the same issue. :)
I'm running transcodes for my entire library now. Thank you!

@marshalleq

For posterity: 32-bit file systems have been dealing with files larger than 2GB for decades; being 32-bit should have nothing to do with this issue.
