PyTorch malware, Cosmopolitan Libc, Google Muse & VALL-E TTS model - CS News #1
Welcome to CS News #1!
CobolStone is a platform for developers to train themselves on modern technology challenges and learn new technical skills. The platform will launch publicly in the coming weeks.
The CS newsletter helps you keep track of the latest news across technology domains like AI, security, software development, and blockchain/P2P, and discover interesting new projects and techniques!
💡 Project Highlight: Cosmopolitan Libc
Cosmopolitan is a project by Justine Tunney, a former Google software engineer, that aims to make the output of C compilation a binary that can run on any platform (Linux, macOS, Windows, FreeBSD, OpenBSD, NetBSD, even bare-metal BIOS) without a VM or interpreter. Yes, it makes your C program runnable everywhere; you can even boot from it!
# create simple c program on command line
printf %s '
main() {
  printf("hello world\n");
}
' >hello.c
# run gcc compiler in freestanding mode
gcc -g -Os -static -fno-pie -no-pie -nostdlib -nostdinc -gdwarf-4 \
  -fno-omit-frame-pointer -pg -mnop-mcount -mno-tls-direct-seg-refs \
  -o hello.com.dbg hello.c -Wl,--gc-sections -fuse-ld=bfd \
  -Wl,-T,ape.lds -include cosmopolitan.h crt.o ape-no-modify-self.o cosmopolitan.a
objcopy -S -O binary hello.com.dbg hello.com
# NOTE: scp it to windows/mac/etc. *before* you run it!
# ~40kb static binary (can be ~16kb w/ MODE=tiny)
./hello.com
This is achieved by outputting the binary as an APE file (αcτµαlly pδrταblε εxεcµταblε), a custom format that makes your program a valid POSIX polyglot: the file embeds the executable formats of all targeted platforms and is interpreted differently on each one, as a Portable Executable on Windows or as a shell script on Unix systems.
Using the ape-no-modify-self.o bootloader (as in the example above), the APE program does not modify itself when executed; instead it extracts an APE loader and runs it. The loader parses the APE file's header, maps the program into memory, executes it, and waits for it to finish.
It is also possible to tell the APE binary to assimilate itself into the system's native format (for example ELF or Mach-O).
$ file hello.com
hello.com: DOS/MBR boot sector
$ ./hello.com --assimilate
$ file hello.com
hello.com: ELF 64-bit LSB executable
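The assimilation is visible in the file's leading magic bytes: before `--assimilate` the binary starts with the DOS `MZ` signature, afterwards with the native format's magic (`\x7fELF` on Linux). A small sketch of such a check (the magic constants are the standard ones; the `classify` helper is ours, not part of Cosmopolitan):

```python
# Classify a binary by its leading magic bytes: un-assimilated APE
# binaries carry the DOS "MZ" signature, while an assimilated binary
# carries the host format's magic (ELF shown here).
def classify(path):
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(b"MZ"):
        return "APE/DOS (portable)"
    if head == b"\x7fELF":
        return "ELF (assimilated)"
    return "unknown"
```

Running `classify("hello.com")` before and after `--assimilate` would show the switch from the portable to the native format.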
🔒 PyTorch discloses a malicious dependency
Between December 25 and December 30, 2022, the pytorch-nightly build could be installed with a malicious dependency that took precedence over the legitimate torchtriton dependency.
This “dependency confusion” technique enables an attacker to mount a software supply chain attack by publishing a package under the same name as one of a project's dependencies. In this case, a malicious torchtriton package was uploaded to the public PyPI repository and took precedence during pip install over the same-named package from the PyTorch nightly index.
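The core of dependency confusion can be sketched as a toy resolver: when the same package name exists on both an extra index and the public one, a resolver that simply picks the highest version across all indexes can be steered toward the attacker's upload. (A toy model with hypothetical version numbers; pip's real resolution is more involved.)

```python
# Toy model of dependency confusion: a resolver that merges candidates
# from several indexes and picks the highest version is vulnerable to a
# higher-versioned package planted on the public index.
def pick_candidate(candidates):
    """candidates: list of (index_name, version_tuple). Returns the winner."""
    return max(candidates, key=lambda c: c[1])

# The legitimate nightly index ships one version; an attacker uploads
# a higher-versioned "torchtriton" to the public index (versions hypothetical).
candidates = [
    ("download.pytorch.org/whl/nightly", (2, 0, 0)),
    ("pypi.org", (3, 0, 0)),  # attacker-controlled upload
]
winner = pick_candidate(candidates)
print(winner[0])  # prints "pypi.org": the public-index package wins
```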
The malicious package was downloaded over 2,500 times during that window and was designed to collect information such as the machine hostname, username, and environment variables, as well as the contents of files like /etc/hosts, /etc/passwd, and files in the $HOME directory, before uploading them to a remote server.
If you believe you are affected, you can uninstall the torch packages:
$ pip3 uninstall -y torch torchvision torchaudio torchtriton
$ pip3 cache purge
The malicious package has since been removed from the PyPI repository.
🧠 Muse: Google’s new Text-To-Image model
Google Research released its new text-to-image model, named Muse, and claims it is “more efficient due to the use of discrete tokens and requiring fewer sampling iterations”.
Example prompt: “A bear riding a bicycle, with a bird perched on the handlebars.”
In addition to achieving a new SOTA on the Conceptual Captions dataset (CC3M), Muse directly enables a number of image editing applications without the need to fine-tune or invert the model: inpainting, outpainting, and mask-free editing.
Even though there is currently no way to try it, the Google Research team produced two versions of the model: the original with 3B parameters and a smaller architecture with 900M parameters.
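The efficiency claim comes from Muse's parallel masked decoding: instead of predicting image tokens one at a time (autoregressively), each refinement step fills in a batch of masked tokens, so a grid of hundreds of tokens needs only a handful of iterations. A toy sketch of the step-count difference (the 50%-per-step schedule is illustrative, not Muse's actual schedule):

```python
import math

def autoregressive_steps(n_tokens):
    # One forward pass per token.
    return n_tokens

def parallel_masked_steps(n_tokens, frac_per_step=0.5):
    # Each step commits a fraction of the remaining masked tokens.
    steps, remaining = 0, n_tokens
    while remaining > 0:
        remaining -= max(1, math.floor(remaining * frac_per_step))
        steps += 1
    return steps

print(autoregressive_steps(256))   # 256 sequential passes
print(parallel_masked_steps(256))  # 9 refinement steps
```

The sampling cost drops from linear in the token count to roughly logarithmic, which is what makes the "fewer sampling iterations" claim plausible.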
🗣 VALL-E: A language modeling approach for Text To Speech synthesis
Microsoft added to its unilm repo a new text-to-speech (TTS) model called VALL-E that treats TTS as a conditional language modeling task rather than continuous signal regression: instead of using a Mel spectrogram as the intermediate representation before the waveform, it uses discrete audio codec codes. This enables the use of advanced prompting-based large-model techniques and allows generating diverse synthesized results in TTS by using different sampling strategies during inference.
VALL-E uses a pre-trained discrete neural audio codec model called “EnCodec” to generate discrete tokens. It then outputs discrete audio codes autoregressively based on the text and a 3-second acoustic prompt. Finally, VALL-E uses a Transformer-based architecture to refine the generation quality.
With this technique, VALL-E achieves SOTA results in zero-shot text-to-speech tasks.
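The pipeline described above can be sketched schematically; every function here is a hypothetical stub standing in for a real model component, not Microsoft's API:

```python
# Schematic of the described VALL-E pipeline: text plus a 3-second
# acoustic prompt go in, discrete EnCodec-style codes come out, and a
# decoder turns codes back into a waveform. All stubs are illustrative.

def phonemize(text):
    # Stand-in for a grapheme-to-phoneme front end.
    return list(text.lower())

def encode_prompt(prompt_seconds=3):
    # Stand-in for EnCodec: the 3 s prompt becomes discrete codec tokens
    # (75 codec frames per second is an assumed, illustrative rate).
    return [0] * (75 * prompt_seconds)

def language_model(phonemes, prompt_codes):
    # Stand-in for the conditional LM: emit one codec token per phoneme,
    # conditioned on the text and the acoustic prompt.
    return [hash((p, len(prompt_codes))) % 1024 for p in phonemes]

def decode(codes):
    # Stand-in for the codec decoder: codes -> waveform samples.
    return [c / 1024.0 for c in codes]

waveform = decode(language_model(phonemize("hello world"), encode_prompt()))
```

The key design point is the middle stage: because the intermediate representation is a sequence of discrete tokens rather than a spectrogram, sampling strategies from text language models apply directly.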
Interesting stuff
GitHub is Sued, and We May Learn Something About Creative Commons Licensing
GPTZero, an app that can quickly detect whether an essay was written by ChatGPT or a human
Let’s discuss!
Follow us on Twitter @CobolStone
Join the community discord