
DARC Blog

How Do I Extract Compressed Files?

Compressed files are commonly used to save storage space and simplify file transfers, especially when dealing with large datasets or collections of files. Knowing how to uncompress these files is essential for quick and efficient access to your data. This guide will walk you through the process of uncompressing files on the Yens, covering several common compression formats. By following the examples below, you'll be able to handle compressed files effectively, whether for transfer or long-term storage.
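As a minimal sketch of the idea, Python's standard library can unpack several common formats (such as .zip, .tar.gz, and .tar.bz2) with one call. The helper name extract_archive and the paths used with it are illustrative assumptions, not commands taken from the guide itself.

```python
import shutil
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack an archive into dest_dir and list the files now under it.

    extract_archive is a hypothetical helper name. The archive format is
    inferred from the file extension by shutil.unpack_archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive_path), str(dest))
    # Return paths relative to dest_dir, sorted for a stable listing.
    return sorted(p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file())
```

Calling extract_archive("sample.tar.gz", "sample_contents") would unpack the archive into a new folder next to it. Only unpack archives from sources you trust, since archive members can contain unexpected paths.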

SEC Filings

Publicly traded companies are required to file registration statements, periodic reports, and other forms with the SEC, and these filings are popular sources of data for researchers at the GSB. This post covers some of the resources available to facilitate research using these filings, along with a few sample workflows.

Installing Software On The Yen Servers

As a Yen user, you can install your own custom software in your home directory or any other location where you have write permissions (such as a shared project space). If you are working with other researchers on a shared project, it is a good idea to keep a dedicated shared software directory where required software is installed.

Working with Large Zip Files in Python

How to work with large zip files without unzipping them

Problem

You need to access data from a zip file, but you don’t want to copy the zip file to your home/project directory and unzip it. How can you access this data efficiently?

Here is an example of a directory containing zip files and other files. It includes two notebooks, a sample zip file, and its unzipped contents stored in the zipcontents folder.
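One way to approach this, sketched below with Python's built-in zipfile module, is to list the archive's members and stream a single member line by line, so nothing is extracted to disk. The function names list_members and iter_member_lines are hypothetical, and the sketch assumes the member is a text file in the given encoding.

```python
import io
import zipfile


def list_members(zip_path):
    """Return the names of the files stored in the zip archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return zf.namelist()


def iter_member_lines(zip_path, member, encoding="utf-8"):
    """Yield the lines of one archive member without extracting it to disk.

    The member is decompressed incrementally as it is read, so a large
    file never has to fit in memory or be copied out of the archive.
    """
    with zipfile.ZipFile(zip_path) as zf:
        with zf.open(member) as raw:
            # Wrap the binary stream so lines come back as decoded text.
            for line in io.TextIOWrapper(raw, encoding=encoding):
                yield line.rstrip("\n")
```

Because iter_member_lines is a generator, a loop such as "for line in iter_member_lines(path, name)" can stop early after reading only the first few rows of a large file.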

Word Embeddings

What are Word Embeddings?

Word embeddings are a method for translating a string of text into an N-dimensional vector of real numbers. Many computational methods cannot accept text as input, so embeddings provide one way of mapping text into a numeric space that those methods can work with.
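To make the idea concrete, here is a toy sketch with N = 3. The vectors in TOY_VECTORS are made-up values chosen for illustration, not output from a trained model, and the names embed and cosine are hypothetical. A string is mapped to the average of its known word vectors, and cosine similarity compares two such vectors.

```python
import math

# Hand-written 3-dimensional vectors, for illustration only.
# Real embeddings are learned from data and usually have far more dimensions.
TOY_VECTORS = {
    "stock": [0.9, 0.1, 0.0],
    "bond": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def embed(text):
    """Map a string to the average of the vectors of its known words."""
    tokens = [t for t in text.lower().split() if t in TOY_VECTORS]
    if not tokens:
        raise ValueError("no known words in text")
    dim = len(next(iter(TOY_VECTORS.values())))
    return [sum(TOY_VECTORS[t][i] for t in tokens) / len(tokens) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

With these toy values, cosine(embed("stock"), embed("bond")) is larger than cosine(embed("stock"), embed("banana")), which mirrors how related words are expected to sit closer together in the vector space.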