How to version-control a data analysis project

How can a data analysis project be versioned effectively?

Which parts need version control and which do not?

How should the charts generated in the data analysis project be managed?


Basically, my plan is to use Jupyter Notebook: put intermediate results (stored as pickle files) and the functions used by the pipeline in a tool module, indicate the version through the numeric label in each notebook's file name, and finally use git for version control. For example:

-- project
  |__ data:
      |__ SQL: SQL
      |__ pickle:
  |__ src: Notebook
  |__ notebooks:
      |__ 0.0 contents and introduction.ipynb: notebook
      |__ 1.0 EDA.ipynb
      |__ 1.1 .ipynb
      |__ 1.2 .ipynb
      |__ 2.0 EDA.ipynb
      |__ ...
      |__ end.0 .ipynb
  |__ temp_module: notebook
  |__ README
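
To make the pickle part of this plan concrete, here is a minimal sketch of how intermediate results could be tied to a notebook label. The helper names `save_result` and `load_result`, the `<label>_<name>.pkl` naming scheme, and the default `data/pickle` location are all assumptions made for illustration, not an established convention.

```python
import pickle
from pathlib import Path

# Assumed location, mirroring data/pickle in the layout above.
PICKLE_DIR = Path("data") / "pickle"


def save_result(obj, label, name, base_dir=PICKLE_DIR):
    """Pickle obj as '<label>_<name>.pkl', e.g. '1.1_features.pkl'.

    label is the notebook's version label (hypothetical scheme).
    """
    base_dir = Path(base_dir)
    base_dir.mkdir(parents=True, exist_ok=True)
    path = base_dir / f"{label}_{name}.pkl"
    with path.open("wb") as fh:
        pickle.dump(obj, fh)
    return path


def load_result(label, name, base_dir=PICKLE_DIR):
    """Load the object written by save_result for the same label and name."""
    path = Path(base_dir) / f"{label}_{name}.pkl"
    with path.open("rb") as fh:
        return pickle.load(fh)
```

A notebook labelled 1.1 could then call `save_result(df, "1.1", "features")`, and a later notebook could call `load_result("1.1", "features")`. Since the pickles can be regenerated from the notebooks, one option would be to keep them out of git and commit only the code that produces them.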