DERI has a dedicated HPC cluster called "Andrena" consisting of 64 NVIDIA A100 GPUs.
We have produced videos specifically for Andrena users to help you set up TensorFlow and PyTorch environments, run your first job, and check the output. Similar videos for pip/virtualenv environments will follow soon. Detailed written instructions are also available on the documentation site: https://docs.hpc.qmul.ac.uk/apps/ml/pytorch/ and https://docs.hpc.qmul.ac.uk/apps/ml/tensorflow/
Tensorflow and Conda on Andrena in 10 minutes: https://www.youtube.com/watch?v=SBjKXxr8yE4&list=PLXHLuk6-uov252OfLFzQhAHcqB0GfovIq
Pytorch and Conda on Andrena in under 10 minutes: https://www.youtube.com/watch?v=LyIvetWGtzo&list=PLXHLuk6-uov252OfLFzQhAHcqB0GfovIq&index=2
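Once an environment is set up, a quick check like the one below can confirm whether the framework can actually see a GPU. This is only a minimal sketch: the helper name describe_gpus is hypothetical and not part of any Andrena tooling, and it assumes the script is run inside a job on a GPU node (on a node without a GPU, or in an environment without PyTorch, it simply reports that nothing was found).

```python
import importlib.util


def describe_gpus():
    """Return a small report of the GPUs visible to PyTorch, if it is installed.

    Hypothetical helper for illustration; not part of any Andrena tooling.
    """
    # Avoid an ImportError when the active environment has no PyTorch.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    import torch

    available = torch.cuda.is_available()
    devices = []
    if available:
        # Collect the name of each CUDA device visible to this process.
        devices = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return {"torch_installed": True, "cuda_available": available, "devices": devices}


if __name__ == "__main__":
    print(describe_gpus())
```

If the report shows no devices inside a GPU job, the environment may have been built with a CPU-only build of the framework; the written instructions linked above are the place to check the recommended install steps.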
Jupyter notebooks: It’s also possible to run Jupyter notebooks on Andrena GPU nodes through OnDemand, similar to Google Colab. The following documentation guides you through the steps, including how to make environments other than the base Anaconda environment visible: https://docs.hpc.qmul.ac.uk/ondemand/jupyter/ TensorBoard, RStudio, MATLAB and other applications can also be used through this service.
Getting help: Andrena is managed by QMUL’s ITS Research team, a friendly and experienced team who can help with all queries regarding use of the service, training, code reviews and recommendations regarding good practice. The team can be contacted at its-research-support@qmul.ac.uk and your enquiry will go directly to a staff member with relevant expertise.