Replies: 4 comments 7 replies
-
MongoDB can store binary files in GridFS, which is currently used to store submission payloads. I was always skeptical about the overhead of exporting all of a task's files to the container filesystem, but it may be acceptable, as the MongoDB cache may be more efficient than a classic filesystem cache if there are not too many files. Tasks should remain serialisable to a filesystem format, both because most users rely on modifying the files on their own machines and to allow sharing courses between instances.
-
One of the main questions is performance. MongoDB is fast, but it will never be as fast as a local filesystem. Even if the filesystem is remote (NFS, sshfs, whatever), performance would probably be better than with MongoDB, notably because of serialization and the way MongoDB passes messages over the wire. There are some tasks with multi-GB folders, and clearly re-downloading them from MongoDB each time a task starts is not possible. Some kind of local caching will be needed, and since we will not be able to rely on the kernel's caching (which sits above NFS/sshfs), it would be on us to implement it. As an alternative, maybe we can look at a FUSE-compatible filesystem backed by MongoDB?
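To make the local caching idea a bit more concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a proposed implementation: the function name `ensure_cached`, the `version` string (assumed to be stored with the task so that a cache hit needs no download) and the `fetch` callable standing in for the actual MongoDB download are all hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict

# Hypothetical type: relative file path -> file content.
TaskFiles = Dict[str, bytes]


def ensure_cached(cache_root: Path, task_id: str, version: str,
                  fetch: Callable[[], TaskFiles]) -> Path:
    """Return a local directory holding the task files for this version.

    `fetch` is only called on a cache miss. Relative paths returned by
    `fetch` are assumed to be trusted (no "../" components).
    """
    target = cache_root / task_id / version
    if target.is_dir():
        return target  # cache hit: nothing is downloaded

    target.parent.mkdir(parents=True, exist_ok=True)
    # Write into a staging directory first so that a half-written
    # download is never visible under the final name.
    staging = Path(tempfile.mkdtemp(dir=target.parent, prefix=".staging-"))
    try:
        for rel_path, data in fetch().items():
            dest = staging / rel_path
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_bytes(data)
        os.rename(staging, target)
    except OSError:
        # Either the write failed or another process filled the cache
        # in the meantime; drop the staging copy in both cases.
        shutil.rmtree(staging, ignore_errors=True)
        if not target.is_dir():
            raise
    return target
```

With something like this, a multi-GB task would be downloaded once per version on each agent, and eviction of old versions would still have to be handled separately.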
-
#834 is another related feature we should consider in this discussion.
-
We should also take the opportunity to reconsider the syllabus integration with this discussion. It could play nicely with the new concept of Course.
-
Currently, tasks and courses are stored on a file system interfaced by the LocalFSProvider class. This implies that this FS has to be mounted somehow in each environment running a frontend instance, e.g. https://github.com/UCL-INGI/INGInious/pull/882/files#diff-e45e45baeda1c1e73482975a664062aa56f20c03dd9d64a827aba57775bed0d3R69-R70.
Even worse, this FS also has to be mounted in the Agents. See #352 for more details and https://github.com/UCL-INGI/INGInious/pull/882/files#diff-e45e45baeda1c1e73482975a664062aa56f20c03dd9d64a827aba57775bed0d3R30-R32 for an example.
We could use the MongoDB instances backing INGInious to store the tasks and the courses. We could also take this opportunity to decouple tasks from courses.
This discussion proposes a new way to store tasks and courses within MongoDB.
Any input is welcome to validate or improve this new design.
New Task Model
task.yaml file is directly embedded as a sub-document of the task.
Rather than embedding directly serialized files in the task, one could embed only an ObjectId pointing to the serialized file.
Images can be stored as base64 encoded strings.
git diff-like structures for the task's files.
New Course Model
course.yaml as a sub-document.
Additional notes
FileSystemProvider interface to minimize the changes within the frontend code.
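As a rough illustration of the task model outlined above, a task document could be assembled along these lines. This is only a sketch under assumptions: the field names (`descriptor`, `files`, `images`), the helper names, and the use of a random hex string as a stand-in for a real ObjectId are all hypothetical and not part of the proposal. Lists of entries are used rather than path-keyed mappings because file names usually contain dots, which are awkward as MongoDB field names.

```python
import base64
import uuid
from typing import Any, Dict, Tuple


def new_file_id() -> str:
    """Stand-in for a MongoDB ObjectId (a real one would come from the driver)."""
    return uuid.uuid4().hex


def build_task_document(
    task_id: str,
    descriptor: Dict[str, Any],
    files: Dict[str, bytes],
    images: Dict[str, bytes],
) -> Tuple[Dict[str, Any], Dict[str, bytes]]:
    """Return (task document, blobs to store separately).

    - `descriptor` is the parsed content of task.yaml, embedded as a sub-document.
    - Each task file is referenced by id; its content goes to the blob store.
    - Images are embedded as base64 encoded strings.
    """
    blobs: Dict[str, bytes] = {}
    file_refs = []
    for path in sorted(files):
        file_id = new_file_id()
        blobs[file_id] = files[path]
        file_refs.append({"path": path, "file_id": file_id})

    image_entries = [
        {"name": name, "data_b64": base64.b64encode(images[name]).decode("ascii")}
        for name in sorted(images)
    ]

    document = {
        "_id": task_id,
        "descriptor": dict(descriptor),
        "files": file_refs,
        "images": image_entries,
    }
    return document, blobs
```

The returned blobs would be stored separately (for instance in GridFS), while the document itself stays small enough to be loaded by the frontend without touching the file contents.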