• Decompiling Python .pyc Files

    Have you accidentally deleted an important Python source file, or do you need to inspect the contents of a .pyc file that has been provided to you? Luckily, .pyc files contain enough information to reproduce the corresponding .py file. You won’t get the original comments or formatting, and you may need to make a few tweaks before the new .py file is completely valid, but this can be a lifesaver when source files are unexpectedly lost and the .pyc files still exist.

    The best tool I have found for this is pycdc (https://github.com/zrax/pycdc).

    To build the C++ executable, run the following steps. You must have cmake and a C++ compiler installed.

    git clone https://github.com/zrax/pycdc.git
    cd pycdc
    cmake .
    make
    

    Once pycdc is built you can use it in the following manner to produce a .py file from a .pyc file:

    path/to/pycdc path/to/file_of_interest.pyc > file_of_interest.py
    
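    pycdc’s bytecode support varies across Python versions, so before decompiling it can help to check which interpreter version produced the .pyc file. The first two bytes of the header are a magic number identifying the bytecode version. A minimal sketch (the helper name pyc_magic is my own):

    ```python
    import importlib.util
    import struct

    def pyc_magic(path):
        """Return the 16-bit magic number from a .pyc file header."""
        with open(path, "rb") as f:
            header = f.read(4)  # magic number (2 bytes) followed by b"\r\n"
        return struct.unpack("<H", header[:2])[0]

    # Magic number of the interpreter running this script, for comparison.
    running_magic = struct.unpack("<H", importlib.util.MAGIC_NUMBER[:2])[0]
    ```

    If the file’s magic number matches running_magic, the .pyc came from the same bytecode version as your interpreter; otherwise you can look the number up in CPython’s importlib/_bootstrap_external.py to identify the version.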

    One quirk I run into: when a function accepts arbitrary keyword arguments with **kwargs and passes them on to other functions in the same way, the decompiled code contains **None where **kwargs should be, and I have to change it back by hand.

    # Decompiled output
    def myfunc(**kwargs):
        result = otherfunc(**None)
        return result
    
    # Should be changed to
    def myfunc(**kwargs):
        result = otherfunc(**kwargs)
        return result
    
  • Installing NetCDF Python Packages

    I am always trying to remember how I have installed netCDF4 and related libraries for Python, and what I need to do differently for Windows systems vs. the Linux systems I usually use.

    On Linux, sometimes I use the system netCDF C libraries, but often I compile and install specific versions of HDF5 and netCDF4 from scratch. Here is how I have built netCDF for various Docker container images.

    # Build HDF5
    cd hdf5-x.y.z
    ./configure --prefix=/usr/local --enable-shared --enable-hl
    make
    make install
    cd ..
    
    # Build netCDF4
    cd netcdf-x.y.z
    # LDFLAGS/CPPFLAGS must be on the configure line (or exported) to take effect
    LDFLAGS=-L/usr/local/lib CPPFLAGS=-I/usr/local/include \
        ./configure --enable-netcdf-4 --enable-dap --enable-shared --prefix=/usr/local --disable-doxygen
    make
    make install
    cd ..
    
    # NetCDF4 Fortran
    cd netcdf-fortran-x.y.z
    ./configure --enable-shared --prefix=/usr/local
    make
    make install
    cd ..
    

    netcdf4-python

    I typically try to use pip to install Python libraries when I can. The basic netCDF module for Python is called netcdf4-python, but it is installed under the name netcdf4:

    pip install netcdf4
    
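    Once installed, a quick sanity check is to write a tiny file and read it back. A minimal sketch (the file name and variable names here are arbitrary):

    ```python
    import numpy as np
    from netCDF4 import Dataset

    # Write a tiny netCDF file with one dimension and one variable.
    with Dataset("example.nc", "w") as nc:
        nc.createDimension("x", 3)
        var = nc.createVariable("temperature", "f4", ("x",))
        var.units = "K"
        var[:] = np.array([270.0, 271.5, 273.0])

    # Read it back.
    with Dataset("example.nc") as nc:
        temps = nc.variables["temperature"][:]
    ```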

    Many people, particularly in industry and government, have adopted Continuum Analytics’ Anaconda Python distribution, which includes a large set of pre-installed packages and a clever package manager called conda. conda attempts to solve dependencies by upgrading or downgrading packages to best satisfy the requirements of the full set of installed packages in the distribution. Packages can include binary dependencies and headers that live in their own isolated environment, so they don’t interfere with necessary system libraries and dependencies.

    conda install netcdf4
    

    xarray

    xarray is a great package that provides a Pandas-like interface to the multidimensional arrays found in file formats like netCDF. A particularly nice aspect is its integration with dask, which allows out-of-core computations on these potentially large (and multi-file) arrays.

    # pip
    pip install xarray
    
    # Anaconda
    conda install xarray
    
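    As a small illustration of the Pandas-like interface (the data here are made up):

    ```python
    import numpy as np
    import xarray as xr

    # A small labeled array: 3 time steps x 4 longitudes.
    da = xr.DataArray(
        np.arange(12.0).reshape(3, 4),
        dims=("time", "lon"),
        coords={"time": [0, 1, 2], "lon": [0.0, 90.0, 180.0, 270.0]},
        name="temperature",
    )

    # Label-based selection and named-dimension reductions, as in Pandas.
    subset = da.sel(lon=90.0)             # values at longitude 90
    mean_over_time = da.mean(dim="time")  # average over the time dimension
    ```

    For out-of-core work, xr.open_dataset("file.nc", chunks={"time": 100}) opens a netCDF file backed by dask arrays, so computations are evaluated lazily in chunks.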
  • Using the Blockchain for Open-access Journals?

    One thing that excites me about the current buzz around blockchain technology is its use for open science. I can’t speak to the feasibility, but it seems to me that a distributed ledger could be an ideal place to publish and provide open-access to scientific research papers and articles.

    If it included a way to store, deliver, and update supporting data, a blockchain could deliver research products that link directly to the data and analysis, providing an unparalleled level of provenance and context for research and results. Citations and work building on the same pieces of data could be connected, allowing for straightforward literature searches and discovery.

    Despite the legal complications of the Sci-Hub project, it seems to me that a distributed-ledger technology would be an ideal way to accomplish their goals in a robust and open manner.

    I haven’t looked too deeply, but are there groups trying something like this out?

  • First Post!

    First post!!