Back in May 2019 we announced the release of OpenL3, an open-source python library for computing deep audio embeddings using an improved L3-Net architecture trained on AudioSet. Today we're happy to announce the release of OpenL3 v0.3.0 which can also compute deep image embeddings!
Installing OpenL3 is easy (requires TensorFlow):
Full details about the embedding models and how they were trained can be found in:
Look, Listen and Learn More: Design Choices for Deep Audio Embeddings
J. Cramer, H.-H. Wu, J. Salamon, and J. P. Bello.
IEEE Int. Conf. on Acoustics, Speech and Signal Proc. (ICASSP), pp 3852-3856, Brighton, UK, May 2019.
We hope the machine learning community, including both computer vision and machine listening researchers, find OpenL3 useful in their work!