Getting Started with virtualenv


Why should you care about using virtualenv?  Basically, it gives each of your Python projects its own isolated development environment.  Not all Python projects call for the same libraries, and keeping track of only the libraries a specific project needs can make life easier by:

  • Reducing the number of dependencies that need to be maintained in a single environment
  • Reducing the risk of breaking a program that depends on a specific version of a library
  • Letting you replicate the environment on other machines without hunting down, one by one, every library you installed once upon a time

Here are some simple steps to get started with virtualenv on a brand new machine. Steps 2 onward should apply to both Linux and OS X; step 1 uses apt-get, which is specific to Debian-based Linux.

On a new installation of Linux (or OS X instance):
1.) sudo apt-get install python3-pip #install pip so you can install virtualenv (the apt package is python3-pip, or python-pip on older Python 2 systems, not pip)
2.) sudo pip install virtualenv #install virtualenv so you can have controlled python environments (the command may be pip3 on some systems)
3.) virtualenv new_project #run this where you want the project directory created; new_project is an example name, so substitute your own
4.) source new_project/bin/activate #start your virtual environment
5.) pip install psycopg2 #install any python library in your new python virtual environment
6.) pip install jupyter #installs Jupyter notebooks (which run in your browser)
7.) deactivate #when you are done working in your virtual environment, deactivate it

 

From here, you can return to step 4 when you want to work on your project again, or go to step 3 if you want to start a new project.
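If you are ever unsure whether step 4 took effect, one way to check is from Python itself. The sketch below is only an illustration: the function name in_virtualenv is made up for this example, and it relies on the fact that older virtualenv releases set sys.real_prefix while newer virtualenv releases and the standard-library venv make sys.prefix differ from sys.base_prefix.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases expose the original interpreter as sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv and the standard-library venv keep the original location
    # in sys.base_prefix, so a mismatch with sys.prefix means "activated".
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
    print("packages install under:", sys.prefix)
```

Running it once before step 4 and once after should show the answer flip and the install location move into the project directory.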

To capture the packages installed in your python virtual environment, run the following while it is active:
pip freeze > requirements.txt

This way, if you need to recreate your environment, you can use the file by typing:
pip install -r requirements.txt
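To get a feel for what ends up in requirements.txt, here is a rough Python approximation of the name==version lines that pip freeze writes. It is a sketch, not pip's own logic: the function name freeze_lines is made up for this example, it needs Python 3.8 or later for importlib.metadata, and it ignores special cases such as editable or version-control installs that pip treats differently.

```python
from importlib import metadata


def freeze_lines():
    """Return sorted 'name==version' strings for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        # Skip entries whose metadata is missing or incomplete.
        name = meta.get("Name") if meta is not None else None
        version = meta.get("Version") if meta is not None else None
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in freeze_lines():
        print(line)
```

Run inside an activated virtualenv, the output should list only that environment's packages, which is exactly why the file stays short and project-specific.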

For more info, go here: http://python-guide-pt-br.readthedocs.io/en/latest/dev/virtualenvs/

Before I go, if you are wondering how to start Jupyter in your web browser from within your new virtualenv, run the following on the command line:

source new_project/bin/activate #assuming you exited from your virtualenv

jupyter-notebook #your web browser should open after this

 

 


Filed under Linux, Python

dlopen(Library not loaded: libssl.dylib) – fail to import psycopg2 in virtualenv


Here is my scenario: I set up virtualenv on my MacBook (love that virtualenv, btw).  I installed Jupyter in one of my virtual environments (pip install jupyter).  While running Jupyter, I attempted to import psycopg2 (which I had previously installed in another virtualenv).  I then received an error like the one below:

 “dlopen(Library not loaded: libssl.dylib)”

Fixing this is simple.  Copy the following files into your /usr/local/lib folder:

  • libssl.1.0.0.dylib
  • libcrypto.1.0.0.dylib

I found these files in my Postgres installation (/Library/PostgreSQL/9.5/lib), navigated there, and then ran the following command:

sudo cp libssl.1.0.0.dylib libcrypto.1.0.0.dylib /usr/local/lib/

After that, I was able to import psycopg2 in Jupyter running in my virtualenv with no problem!
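Before retrying the import, it can help to confirm that both files really landed in the target folder. The helper below is a small sketch; missing_libs and REQUIRED_LIBS are names invented for this example, while the two file names and /usr/local/lib come from the fix described above.

```python
from pathlib import Path

# File names and default directory are taken from the fix described in this post.
REQUIRED_LIBS = ("libssl.1.0.0.dylib", "libcrypto.1.0.0.dylib")


def missing_libs(directory="/usr/local/lib", names=REQUIRED_LIBS):
    """Return the names from `names` that are not present as files in `directory`."""
    folder = Path(directory)
    return [name for name in names if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_libs()
    if missing:
        print("still missing:", ", ".join(missing))
    else:
        print("both libraries are in place")
```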


Filed under Python, Software Programs

Bayes Theorem – broken down


To arrive at an understanding of Bayes Theorem, we begin with the definition of a conditional probability.

P(A | B) translates to “The probability of A happening, given that event B has occurred”.

P(A | B) can be rewritten as

P(A | B) = P(A ∩ B) / P(B)

 

Also, P(A ∩ B) = P(A | B) × P(B) (i.e. the probability of A and B).

And because P(A ∩ B) can also be expressed as P(B ∩ A),

we can say  P(B ∩ A) = P(B | A) × P(A)

Therefore, we can also express “the probability of A happening given event B” as

P(A | B) = [P(B | A) × P(A)] / P(B)
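As a sanity check on the two expressions for P(A | B) above, here is a small Python sketch that enumerates the 36 outcomes of rolling two dice and computes each quantity as an exact fraction. The events are made up for illustration: A is "the dice sum to 8" and B is "the first die is even".

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a predicate over (first, second) rolls."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


def is_a(o):  # hypothetical event A: the two dice sum to 8
    return o[0] + o[1] == 8


def is_b(o):  # hypothetical event B: the first die is even
    return o[0] % 2 == 0


def is_a_and_b(o):
    return is_a(o) and is_b(o)


p_a, p_b, p_ab = prob(is_a), prob(is_b), prob(is_a_and_b)

p_a_given_b = p_ab / p_b             # definition: P(A ∩ B) / P(B)
p_b_given_a = p_ab / p_a             # same definition with the roles swapped
via_bayes = p_b_given_a * p_a / p_b  # [P(B | A) × P(A)] / P(B)

print(p_a_given_b, via_bayes)  # both print 1/6
```

Both routes give 1/6, which is what the derivation says should happen: the two formulas are the same quantity written two ways.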

 

We can further expand the probability of B by writing it as

P(B) = P(A) × P(B | A) + P(A′) × P(B | A′)

(in normal language, the probability of B is equal to the probability of A multiplied by the probability of B given A, plus the probability of the complement of A multiplied by the probability of B given the complement of A)

This is the Law of Total Probability.

This law enables us to find the total probability of a particular event based on conditional probabilities.  Also, our new expression can be substituted into our formula for “the probability of A happening given event B” as the denominator to give us Bayes Theorem.

Bayes Theorem provides a means of finding reverse conditional probabilities when you don’t know every probability up front.

 

Bayes Theorem

P(A | B) = [P(A) × P(B | A)] / [P(A) × P(B | A) + P(A′) × P(B | A′)]

(A′ is the complement of A.)
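To see the formula at work with numbers, here is a minimal Python sketch. The function names and the three input probabilities are assumptions chosen purely for illustration (think of A as a rare condition and B as a positive test result); they are not taken from any real data.

```python
def total_probability(p_a, p_b_given_a, p_b_given_not_a):
    """Law of Total Probability: P(B) = P(A)·P(B|A) + P(A')·P(B|A')."""
    return p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a


def bayes_posterior(p_a, p_b_given_a, p_b_given_not_a):
    """Bayes Theorem: P(A|B) with P(B) expanded by the Law of Total Probability."""
    p_b = total_probability(p_a, p_b_given_a, p_b_given_not_a)
    if p_b == 0:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    return p_a * p_b_given_a / p_b


# Illustrative inputs only: P(A) = 0.01, P(B | A) = 0.9, P(B | A') = 0.05.
p_b = total_probability(0.01, 0.9, 0.05)
posterior = bayes_posterior(0.01, 0.9, 0.05)
print(round(p_b, 4), round(posterior, 3))  # 0.0585 0.154
```

Even with P(B | A) at 0.9, the posterior is only about 0.154, because A is rare and the P(A′) × P(B | A′) term dominates the denominator.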


Filed under Math, Sharing Stuff

tPostgresqlOutput_3 org.postgresql.util.PSQLException: ERROR: length for type varchar must be at least 1


There are several reasons why you may receive this error when using Talend Open Studio, but the one I recently encountered was user error.  When I created my schema, I had placed my varchar length values under the “Precision” column instead of “Length” column.  This was the cause of my error.

Misplaced varchar lengths in Talend Open Studio schema

I prefer these errors over product bugs!

 

 


Filed under SQL

Listing out Built in Functions in Python


To retrieve a list of Built in Functions in Python, type:  dir(__builtins__)

You will get the following list

[Screenshot: output of dir(__builtins__)]

To understand what each name does, you can type help(name) to get more details on it.  Example: help(NameError)  (Note that dir(__builtins__) lists built-in exceptions such as NameError alongside the functions.)

[Screenshot: output of help(NameError)]
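Because the list mixes functions with exception classes, types, and constants, it can be handy to split it up. The sketch below is one way to do that, with variable names invented for this example. It imports the builtins module instead of using __builtins__ directly, since in CPython __builtins__ can be a plain dict outside the main module.

```python
import builtins
import inspect

names = dir(builtins)

# Public names that are callable but are not classes: len, print, sorted, ...
functions = [
    n for n in names
    if not n.startswith("_")
    and callable(getattr(builtins, n))
    and not inspect.isclass(getattr(builtins, n))
]

# Classes derived from BaseException: NameError, ValueError, KeyError, ...
exceptions = [
    n for n in names
    if inspect.isclass(getattr(builtins, n))
    and issubclass(getattr(builtins, n), BaseException)
]

print(len(names), "names in total")
print(len(functions), "functions, for example:", functions[:5])
print(len(exceptions), "exceptions, for example:", exceptions[:5])
```

Types such as int and str fall into neither list because they are classes but not exceptions; help(int) works on them just the same.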

 


Filed under Computer - Technical, Python

List of Built in Functions in Python


To get a list of built-in functions in Python, type the following: dir(__builtins__)

It will result in a list like the one below

 

 


Filed under Sharing Stuff

Talend Data Integration – Exception in component tOracleOutput_1 java.lang.ArrayIndexOutOfBoundsException: -32703


You are probably seeing this error as you attempt to transfer data with many date values from one Oracle database to another Oracle database using Talend Studio.

A quick fix (I am not sure why it works) is to decrease the commit size in your tOracleOutput component. The default is 10,000.  I reduced mine to 500.  See the illustration below.

Changing Commit In Talend Data Integration

Comments are always welcome.


Filed under SQL