Practical Guide: Git and GitHub for Data Science


Hello guys, in this article I've collected the commands and steps you need to develop any project with Git. Everything here is explained from scratch, so no prior knowledge is required to get started.

This article is beginner-friendly, and anyone who wants to kickstart the development of their data science projects with Git can follow along.

So without further ado, let's get into it :)

Let's first understand what Git is 🤔

What is Git?

Git is an open-source distributed version control system that tracks the changes made to any set of files. Take a simple scenario: you and I are working on a project from different machines, and we have both changed the files. If the manager then reviews the work and asks us to undo the changes and restore things as they were, reconstructing the old code by hand becomes a headache. With Git, we can save every change as a commit and return to that version of the files whenever we need to… interesting, right? 🤩

Git also makes collaboration easier, allowing changes by multiple people to all be merged in a single source.

Now I'll go straight to the practical part: how to use it as a developer.

On Local Machine

I'm using Git Bash to work with Git; you can use your regular command line as well.

Create a folder for each project you want to work on; in my case I created Developer1 and Developer2. Then go inside the Developer1 folder and initialize Git using the git init command.

The significance of git init is that it sets up the Git environment for our project: it creates a hidden .git directory inside the folder, which turns the folder into an empty repository. The .git directory holds everything Git needs for version control on the local machine (the stored versions, branch pointers, and configuration). It's like a container for Git, i.e. the folder works the same way on any OS.
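To see this for yourself, here is a minimal sketch of the folder setup. It assumes the folders live under a parent called GitPractice (the name used later in this article) and works in a temporary directory so nothing else on your machine is touched; the workdir variable is just for illustration.

```shell
# Throwaway location for the experiment.
workdir="$(mktemp -d)"

# One folder per developer, as described above.
mkdir -p "$workdir/GitPractice/Developer1" "$workdir/GitPractice/Developer2"
cd "$workdir/GitPractice/Developer1"

# Turn Developer1 into an empty Git repository.
git init

# -a shows hidden entries, so the new .git directory is visible.
ls -a
```

After this, Developer1 contains a .git directory, while Developer2 is still a plain folder.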

Now configure the identity for this folder: git config user.name "developer1" and git config user.email "<your email>".
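Here is a small sketch of the configuration step. The email address developer1@example.com is only a placeholder I made up for the example, so replace it with your own. Without the --global flag, these settings are stored only in this repository's .git/config file.

```shell
# Fresh throwaway repository for the example.
repo="$(mktemp -d)"
cd "$repo"
git init

# Identity used for commits made in this repository only.
git config user.name "developer1"
git config user.email "developer1@example.com"   # placeholder, use your own address

# Read the values back to confirm they were saved.
git config user.name
git config user.email
```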

Now create a file to track inside the folder using touch dev.html, and modify it using vim dev.html.

Now, to add files to the Git staging area, use git add filename or simply git add . (which stages everything in the folder), and then use git commit -m "message" to save the version along with a short description of the modification.

Now try out different modifications. After each commit we get a unique version ID (the commit hash), so that if required we can go back to that version of the file as well.
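The add and commit cycle can be sketched like this. I use echo to write the file instead of vim so the example runs without an editor; the file contents, commit messages, and email address are made up for illustration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "developer1"
git config user.email "developer1@example.com"   # placeholder address

# First version of the file: stage it by name, then commit.
echo "<h1>version 1</h1>" > dev.html
git add dev.html
git commit -m "Add dev.html"

# Second version: "git add ." stages every change in the folder.
echo "<h1>version 2</h1>" > dev.html
git add .
git commit -m "Update heading"

# One line per commit: the short version ID followed by the message.
git log --oneline
```

The IDs printed by git log are the ones you can later pass to git checkout.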

Here’s what happened internally under the above commands.

Now let's do some time travel 🚀. Below is the dev.html file after the third commit. Using git checkout "version_id", we can go back to a previous version, as identified by its version_id.

Before Checkout

After Checkout (directly to the first version)
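Here is a rough sketch of the same time travel in a throwaway repository. The three versions are created in a loop just to keep it short, and the variables first and branch are helpers for this example only. Note that after checking out an old version ID, Git is in a "detached HEAD" state, so the last command returns to the branch.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "developer1"
git config user.email "developer1@example.com"   # placeholder address

# Three commits, each with different file content.
for n in 1 2 3; do
  echo "<h1>version $n</h1>" > dev.html
  git add dev.html
  git commit -m "Version $n"
done

# Remember the current branch name and the ID of the very first commit.
branch="$(git rev-parse --abbrev-ref HEAD)"
first="$(git rev-list --max-parents=0 HEAD)"

cat dev.html              # before checkout: <h1>version 3</h1>
git checkout "$first"     # jump back to the first version
cat dev.html              # after checkout: <h1>version 1</h1>

# Return to the latest version on the branch.
git checkout "$branch"
```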

So far we have worked on the local machine; now let's see how it works with GitHub.

On GitHub

Now, to push the Git folder that we created (GitPractice/Developer1) to a GitHub repo, use the commands below.

Note: You need a GitHub account and a new repository to push the folder into.

Initially, before pushing, the repo is empty.

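  1. git remote add origin github_repo_link: connects the local repo to the GitHub repo under the name origin.

  2. git push origin master: pushes the local commits on the master branch to GitHub (use your own branch name if it differs, e.g. main).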
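You can rehearse these two commands without a GitHub account. In the sketch below, a local bare repository (remote.git) stands in for the empty GitHub repo; that stand-in, the base variable, and the file contents are my assumptions for the example. With GitHub you would pass your repository link instead of the local path.

```shell
base="$(mktemp -d)"

# Stand-in for the empty GitHub repository.
git init --bare "$base/remote.git"

# The local project with one commit.
mkdir "$base/Developer1"
cd "$base/Developer1"
git init
git config user.name "developer1"
git config user.email "developer1@example.com"   # placeholder address
echo "<h1>hello</h1>" > dev.html
git add dev.html
git commit -m "Add dev.html"

# Newer Git versions may call the first branch "main"; rename it to match the article.
git branch -M master

# 1. Connect the local repo to the remote under the name "origin".
git remote add origin "$base/remote.git"

# 2. Push the master branch.
git push origin master

# Show which branches now exist on the remote.
git ls-remote --heads origin
```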

After Push

Conversely, if we want to turn a GitHub repo into a local folder, i.e. to clone the GitHub repo, use git clone "github_repo_url". It will download all the files of the GitHub repository, together with their history, to your local machine.
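A quick way to try cloning is sketched below. An ordinary local repository (source) plays the role of the GitHub repo here, which is an assumption made so the example runs offline; with GitHub you would pass the repository URL instead of the local path.

```shell
base="$(mktemp -d)"

# Stand-in for the GitHub repository: a local repo with one commit.
git init "$base/source"
cd "$base/source"
git config user.name "developer1"
git config user.email "developer1@example.com"   # placeholder address
echo "<h1>hello</h1>" > dev.html
git add dev.html
git commit -m "Add dev.html"

# Clone it into a new folder called "copy".
cd "$base"
git clone "$base/source" "$base/copy"

# The clone has the files and the full commit history.
ls "$base/copy"
git -C "$base/copy" log --oneline
```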

Team Collaboration with Git

Now that we've pushed our files to GitHub, suppose developer2 wants to make changes to them. developer2 connects to developer1's repository and pulls the dev.html file, makes some modifications, and pushes them back to the GitHub repository.

BAM 😮!! Push rejected. Why?

It happens in a scenario like this: developer1 has pushed all versions to the GitHub repository. developer2 pulls dev.html from that repository, makes changes, and pushes them, so a new version is created on GitHub. If developer1 then commits dev.html again and tries to push it to the GitHub repo, the push is rejected, because GitHub now holds a version that developer1's local repo does not have… this is known as a conflict problem.

Hence, to resolve the conflict, pull developer2's version from GitHub and resolve the differences locally… then push developer1's version. This resolves the issue, and a merge commit recording the resolution is created.
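The whole scenario can be replayed locally with the rough sketch below. A local bare repository stands in for GitHub, and the file contents, email addresses, and the way I resolve the conflict (keeping both lines) are assumptions for the example; in practice you would open dev.html, edit the conflict markers by hand, and decide what to keep.

```shell
base="$(mktemp -d)"
git init --bare "$base/remote.git"   # stand-in for the GitHub repo

# developer1 creates the project and pushes it.
git init "$base/Developer1"
cd "$base/Developer1"
git config user.name "developer1"
git config user.email "developer1@example.com"   # placeholder address
echo "<h1>original</h1>" > dev.html
git add dev.html
git commit -m "Add dev.html"
git branch -M master
git remote add origin "$base/remote.git"
git push origin master

# developer2 clones, changes the file, and pushes first.
git clone -b master "$base/remote.git" "$base/Developer2"
cd "$base/Developer2"
git config user.name "developer2"
git config user.email "developer2@example.com"   # placeholder address
echo "<h1>change by developer2</h1>" > dev.html
git commit -am "developer2 edits heading"
git push origin master

# developer1 changes the same line without pulling first.
cd "$base/Developer1"
echo "<h1>change by developer1</h1>" > dev.html
git commit -am "developer1 edits heading"
git push origin master || echo "push rejected: the remote has commits we do not have"

# Pull developer2's version; both sides changed the same line, so Git reports a conflict.
git pull --no-rebase origin master || echo "conflict in dev.html, resolve it by hand"

# Resolve by writing the final content, then commit the merge and push again.
printf '%s\n' "<h1>change by developer1</h1>" "<h1>change by developer2</h1>" > dev.html
git add dev.html
git commit -m "Merge developer2's changes and resolve conflict"
git push origin master
```

The final push succeeds because developer1's history now contains developer2's commit.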

Now it’s working🥳

And the modifications are also successfully reflected in the dev.html file.

Git Branches

Branches allow you to develop features, fix bugs, or safely experiment with new ideas in a contained area of your repository.

Commands for Git branch creation and merging
1. git branch <branch_name>    => to create a branch.
2. git checkout <branch_name>  => to switch to another branch.
3. git branch                  => to list all branches; the current one is marked with *.
4. git merge <branch_name>     => to bring the changes from the <branch_name> branch into the current branch.
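The four commands above can be tried together in a small sketch like this one. The branch name feature and the file contents are made up for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "developer1"
git config user.email "developer1@example.com"   # placeholder address
echo "<h1>hello</h1>" > dev.html
git add dev.html
git commit -m "Add dev.html"
git branch -M master          # make sure the main line is called master

git branch feature            # 1. create a branch
git checkout feature          # 2. switch to it
echo "<p>new paragraph</p>" >> dev.html
git commit -am "Add paragraph on feature branch"

git checkout master           # back on master, dev.html has no paragraph yet
git branch                    # 3. list branches, * marks the current one
git merge feature             # 4. bring feature's changes into master
cat dev.html                  # now contains the paragraph as well
```

Until the merge, the experiment on feature stays contained and master is untouched.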

That's all about Git and GitHub… just try it once; it's really interesting and you'll definitely get it :)

If you liked 🤝 this article, do comment, and you can connect with me on LinkedIn or follow me on GitHub and Kaggle.

Happy Learning:)