README.md

   1 # KarmaWorld
   2 __Description__: A django application for sharing and uploading class notes.
   3
   4 __Copyright__: FinalsClub, a 501c3 non-profit organization
   5
   6 __License__: GPLv3 except where otherwise noted
   7
   8 __Contact__: info@karmanotes.org
   9
  10 v3.0 of the karmanotes.org website from the FinalsClub Foundation
  11
  12
  13
  14
  15 # Purpose
  16
  17 KarmaNotes is an online database of college lecture notes.  KarmaNotes empowers college students to participate in the free exchange of knowledge.
  18
  19 # Pre-Installation
  20
  21 ## Code
  22
  23 Before doing anything, you'll need the code. Grab it from GitHub.
  24
  25 Clone the project from the central repo using your GitHub account:
  26
  27     git clone git@github.com:FinalsClub/karmaworld.git
  28
  29 If you aren't using a system set up for GitHub, then grab the project with
  30 this command instead:
  31
  32     git clone https://github.com/FinalsClub/karmaworld.git
  33
  34 Generally speaking, this will create a subdirectory called `karmaworld` under
  35 the directory where the `git` command was run. This git repository directory
  36 will be referred to herein as `{project_root}`.
  37
  38 There might be some confusion as the git repository's directory will likely be
  39 called `karmaworld` (this is `{project_root}`), but there is also a `karmaworld`
  40 directory underneath that (`{project_root}/karmaworld`) alongside files like
  41 `fabfile.py` (`{project_root}/fabfile.py`) and `README.md`
  42 (`{project_root}/README.md`).
  43
  44 ## External Service Dependencies
  45
  46 Notice: This software makes use of external third party services which require
  47 accounts to access the service APIs. Without these third parties available,
  48 this software may require considerable overhaul.
  49
  50 ### Filepicker
  51 This software uses [Filepicker.io](https://www.inkfilepicker.com/) for uploading
  52 files. This requires an account with Filepicker.
  53
  54 Filepicker requires an additional third party file hosting site where it may
  55 send uploaded files. For this project, we have used Amazon S3.
  56
  57 Filepicker will provide an API key. This is needed by the software.
  58
  59 ### Amazon S3
  60
  61 #### for Filepicker
  62 This software uses [Amazon S3](http://aws.amazon.com/s3/) as a third party file
  63 hosting site. The primary use case is a destination for Filepicker files. The
  64 software won't directly need any S3 information for this use case; it will be
  65 provided directly to Filepicker.
  66
  67 #### for Static File hosting
  68 A secondary use case for S3 is hosting static files. The software will need to
  69 update static files on the S3 bucket. In this case, the software will need the
  70 S3 bucket name, access key, and secret key.
  71
  72 The code assumes S3 is used for static files in a production environment. To
  73 obviate the need for hosting static files through S3 (noting that it still might
  74 be necessary for Filepicker), a workaround was explained [in this Github ticket](https://github.com/FinalsClub/karmaworld/issues/192#issuecomment-30193617).
  75
  76 That workaround is repeated here. Make the following changes to
  77 `{project_root}/karmaworld/settings/prod.py`:
  78
  79 1. comment out everything about static_s3 from imports
  80 2. comment out storages from the `INSTALLED_APPS`
  81 3. change `STATIC_URL` to `'/assets/'`
  82 4. comment out the entire storages section (save for part of `INSTALLED_APPS` and `STATIC_URL`)
  83 5. add this to the nginx config:
  84
  85     location /assets/ {
  86         root /var/www/karmaworld/karmaworld/;
  87     }
  88
  89 ### Google Drive
  90 This software uses [Google Drive](https://developers.google.com/drive/) to
  91 convert documents to and from various file formats.
  92
  93 A Google Drive service account with access to the Google Drive is required. This
  94 may be done with a Google Apps account with administrative privileges, or ask your business sysadmin.
  95
  96 These are the instructions to create a Google Drive service account:
  97 https://developers.google.com/drive/delegation
  98
  99 When completed, you'll have a file called `client_secrets.json` and a p12 file
 100 which is the key to access the service account. Both are needed by the software.
 101
 102 ### Twitter
 103
 104 Twitter is used to post updates about new courses. Access to the Twitter API
 105 will be required for this task.
 106
 107 If this Twitter feature is desired, the consumer key and secret as well as the
 108 access token key and secret are needed by the software.
 109
 110 If the required files are not found, then no errors will occur.
 111
 112 # Development Install
 113
 114 If you need to set up the project for development, it is highly recommended that
 115 you create a development virtual machine or (if available) grab one that
 116 has already been created for your site.
 117
 118 The *host machine* is the system which runs e.g. VirtualBox, while the
 119 *virtual machine* refers to the system running inside e.g. VirtualBox.
 120
 121 ## Creating a Virtual Machine by hand
 122
 123 Create a virtual machine with your favorite VM software. Configure the virtual
 124 machine for production with the steps shown in the [Production Install](#production-install) section.
 125
 126 ## Creating a Virtual Machine with Vagrant
 127
 128 Vagrant supports a variety of virtual machine software and there is additional
 129 support for Vagrant to deploy to a wider variety. However, for these
 130 instructions, it is assumed Vagrant will be deployed to VirtualBox.
 131
 132 1. Configure external dependencies on the host machine:
 133    * Under `{project_root}/karmaworld/secret/`:
 134         1. Copy files with the example extension to the corresponding filename
 135           without the example extension (e.g.
 136           `cp filepicker.py.example filepicker.py`)
 137         1. Modify those files, but ignore `db_settings.py` (Vagrant takes care of that one)
 138         1. Copy the Google Drive service account p12 file to `drive.p12`
 139            (this filename and location may be changed in `drive.py`)
 140         1. Ensure `*.py` in `secret/` are never added to the git repo.
 141            (.gitignore should help warn against taking this action)
 142
 143 1. Install [VirtualBox](http://www.virtualbox.com/)
 144
 145 1. Install [vagrant](http://www.vagrantup.com/) 1.3 or higher
 146
 147 1. Use Vagrant to create the virtual machine.
 148    * While in `{project_root}`, type `vagrant up`
 149
 150 1. Connect to the virtual machine with `vagrant ssh`
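
The first step above (copying the `.example` files under `secret/`) can be sketched in Python. This is a minimal sketch, not part of the repository: the helper name `copy_example_files` is hypothetical, and it assumes every example file simply ends in `.example`. Existing files are left alone so that secrets already edited by hand are never overwritten.

```python
import shutil
from pathlib import Path


def copy_example_files(secret_dir):
    """Copy each `*.example` file to the same name without the suffix.

    Hypothetical helper: targets that already exist are skipped.
    Returns the names of the files that were created.
    """
    created = []
    for example in sorted(Path(secret_dir).glob("*.example")):
        # filepicker.py.example -> filepicker.py
        target = example.with_suffix("")
        if target.exists():
            continue
        shutil.copyfile(example, target)
        created.append(target.name)
    return created
```

Run from `{project_root}`, a call such as `copy_example_files("karmaworld/secret")` would create the working copies, which still need to be edited by hand afterwards.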
 151
 152 Note:
 153 Port 80 of the virtual machine will be configured as port 6659 on the host
 154 system. While on the host system, fire up your favorite browser and point it at
 155 `http://localhost:6659/`. This connects to your host system on port 6659, which
 156 forwards to your virtual machine's web site.
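
To confirm the port forwarding described in the note, a quick check from the host machine might look like the sketch below. It only tests whether something accepts TCP connections on port 6659; it says nothing about whether the site itself is healthy. The function name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host="localhost", port=6659, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print("http://localhost:6659/ is " + state)
```

If the port is not reachable, the virtual machine may not be running yet, or the web server inside it may not have been started.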
 157
 158 ## Completing the Virtual Machine with Fabric
 159
 160 1. On the virtual machine, type `cd karmaworld` to get into the code repository.
 161
 162 1. In the code repo of the VM, type `fab -H 127.0.0.1 first_deploy`
 163
 164    During this process, you will be queried to create a Django site admin.
 165    Provide information. You will be asked to remove duplicate schools. Respond
 166    with yes.
 167
 168 # Production Install
 169
 170 These steps are taken care of by automatic utilities. Vagrant performs the
 171 first subsection of these instructions and Fabric performs the second
 172 subsection. These instructions are detailed here for good measure, but should
 173 not generally be needed.
 174
 175 1. Ensure the following are installed:
 176    * `git`
 177    * `7zip` (for unzipping US Department of Education files)
 178    * `PostgreSQL` (server and client)
 179    * `nginx`
 180    * `libxslt` and `libxml2` (used by some Python libraries)
 181    * `RabbitMQ` (server)
 182    * `memcached`
 183    * `Python`
 184    * `PIP`
 185    * `virtualenv`
 186    * `virtualenvwrapper` (might not be needed anymore)
 187
 188    On a Debian system supporting Apt, this can be done with:
 189
 190        sudo apt-get install python-pip postgresql python-virtualenv \
 191                             virtualenvwrapper git nginx p7zip-full \
 192                             postgresql-server-dev-9.1 libxslt1-dev libxml2-dev \
 193                             libmemcached-dev python-dev rabbitmq-server
 194
 195 1. Generate a PostgreSQL database and a role with read/write permissions.
 196    * For Debian, these instructions are helpful: https://wiki.debian.org/PostgreSql
 197
 198 1. Modify configuration files.
 199    * There are settings in `{project_root}/karmaworld/settings/prod.py`
 200        * Most of the settings should work fine by default.
 201    * There are additional configuration options for external dependencies
 202      under `{project_root}/karmaworld/secret/`.
 203         1. Copy files with the example extension to the corresponding filename
 204           without the example extension (e.g.
 205           `cp filepicker.py.example filepicker.py`)
 206         1. Modify those files.
 207            * Ensure `PROD_DB_USERNAME`, `PROD_DB_PASSWORD`, and `PROD_DB_NAME`
 208              inside `db_settings.py` match the role, password, and database
 209              generated in the previous step.
 210         1. Copy the Google Drive service account p12 file to `drive.p12`
 211            (this filename and location may be changed in `drive.py`)
 212         1. Ensure `*.py` in `secret/` are never added to the git repo.
 213            (.gitignore should help warn against taking this action)
 214
 215 1. Make sure that /var/www exists, is owned by the www-data group, and that
 216    the desired user is a member of the www-data group.
 217
 218 1. Configure nginx with a `proxy_pass` to port 8000 (or whatever port gunicorn
 219    will be running the site on) and any virtual hosting that is desired.
 220    Here is an example server file to put into `/etc/nginx/sites-available/`
 221
 222         server {
 223             listen 80;
 224             # don't do virtual hosting, handle all requests regardless of header
 225             server_name "";
 226             client_max_body_size 20M;
 227
 228             location / {
 229                 # pass traffic through to gunicorn
 230                 proxy_pass http://127.0.0.1:8000;
 231             }
 232         }
 233
 234 1. Configure the system to start supervisor on boot. An init script for
 235    supervisor is in the repo at `{project_root}/karmaworld/confs/supervisor`.
 236    `update-rc.d supervisor defaults` is the Debian command to load the init
 237    script into the correct directories.
 238
 239 1. Make sure `{project_root}/var/log` and `{project_root}/var/run` exist and
 240    may be written to, or else put the desired logging and run file paths into
 241    `{project_root}/confs/prod/supervisord.conf`
 242
 243 1. Create a virtualenv under `/var/www/karmaworld/venv`
 244
 245 1. Change into the virtualenv with `. /var/www/karmaworld/venv/bin/activate`.
 246    Within the virtualenv:
 247
 248     1. Update the Python dependencies with `pip install -r {project_root}/reqs/prod.txt`
 249
 250     1. Setup the database with `python {project_root}/manage.py syncdb --migrate`
 251
 252     1. Collect static resources and put them in the static hosting location with
 253        `python {project_root}/manage.py collectstatic`
 254
 255 1. The database needs to be populated with schools. A list of accredited schools
 256    may be found on the US Department of Education website:
 257    http://ope.ed.gov/accreditation/GetDownloadFile.aspx
 258
 259    Alternatively, use the built-in scripts while in the virtualenv:
 260
 261    1. Fetch USDE schools with
 262       `python {project_root}/manage.py fetch_usde_csv ./schools.csv`
 263
 264    1. Upload the schools into the database with
 265       `python {project_root}/manage.py import_usde_csv ./schools.csv`
 266
 267    1. Clean up redundant information with
 268       `python {project_root}/manage.py sanitize_usde_schools`
 269
 270 1. Start up `supervisor`, which will run `celery` and `gunicorn`. This may be
 271    done from within the virtualenv by typing
 272    `python {project_root}/manage.py start_supervisord`
 273
 274 1. If everything went well, gunicorn should be running the website on port 8000
 275    and nginx should be serving gunicorn on port 80.
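
The step above that asks for writable `var/log` and `var/run` directories can be checked with a short script. This is a sketch under the assumption that the default paths are kept; the helper name `ensure_writable_dirs` is hypothetical and nothing like it ships with the project.

```python
import os


def ensure_writable_dirs(project_root, subdirs=("var/log", "var/run")):
    """Create each directory if it is missing and report whether the
    current user may write to it."""
    status = {}
    for sub in subdirs:
        path = os.path.join(project_root, sub)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        status[sub] = os.access(path, os.W_OK)
    return status
```

Calling `ensure_writable_dirs("/var/www/karmaworld")` as the user that will run supervisor returns a dictionary mapping each subdirectory to `True` or `False`. A `False` entry suggests fixing ownership or permissions before starting supervisor.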
 276
 277 # Accessing the Vagrant Virtual Machine
 278
 279 ## Connecting to the VM via SSH
 280 If you have installed a virtual machine using `vagrant up`, you can connect
 281 to it by running `vagrant ssh` from `{project_root}`.
 282
 283 ## Connecting to the development website on the VM
 284 To access the website running on the VM, point your browser at
 285 http://localhost:6659/ using your host computer.
 286
 287 Port 6659 on your local machine is set to forward to the VM's port 80.
 288
 289 Fun fact: 6659 was chosen because of OM (Sanskrit) and KW (KarmaWorld) on a
 290 phone: 66 59.
 291
 292 ## Updating the VM code repository
 293 Once connected to the virtual machine by SSH, you will see `karmaworld` in
 294 the home directory. That is the `{project_root}` in the virtual machine.
 295
 296 `cd karmaworld` and then use `git fetch; git merge` and/or `git pull origin` as
 297 desired.
 298
 299 The virtual machine's code repository is set to use your host machine's
 300 local repository as the origin. So if you make changes locally and commit them,
 301 without pushing them anywhere, your VM can pull those changes in for testing.
 302
 303 This may seem like duplication. It is. The duplication allows your host machine
 304 to maintain git credentials and manage repository access control so that your
 305 virtual machine doesn't need sensitive information. Your virtual machine simply
 306 pulls from the local repository on your local file system without needing
 307 credentials, etc.
 308
 309 ## Other Vagrant commands
 310 Please see [vagrant documentation](http://docs.vagrantup.com/v2/cli/index.html)
 311 for more information on how to use the vagrant CLI to manage your development
 312 VM.
 313
 314 # Django Database management
 315
 316 ## South
 317
 318 We have set up Django to use
 319 [South](http://south.aeracode.org/wiki/QuickStartGuide) for migrations. When
 320 changing models, it is important to run
 321 `python {project_root}/manage.py schemamigration`, which will create a migration
 322 to reflect the model changes into the database. These changes can be applied
 323 to the database with `python {project_root}/manage.py migrate`.
 324
 325 Sometimes a migration has already been performed on the database, but South
 326 was never told about it. There are subtleties to this process which
 327 require looking at the South docs. As a tip, start by looking at the `--fake`
 328 flag.
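
As a rough illustration of the usual South workflow, the sketch below builds the two commands for a schema change in one app. South's `schemamigration` command expects an app label and, for changes to existing models, the `--auto` flag. The app label `notes` used in the usage note and the helper names are assumptions for illustration only; substitute the app whose models changed.

```python
import subprocess
import sys


def south_commands(app_label, manage_py="manage.py"):
    """Build the two South commands for an auto-detected schema change."""
    base = [sys.executable, manage_py]
    return [
        # Write a migration file describing the model changes.
        base + ["schemamigration", app_label, "--auto"],
        # Apply pending migrations for that app to the database.
        base + ["migrate", app_label],
    ]


def run_south(app_label, manage_py="manage.py"):
    """Run both commands in order, stopping at the first failure."""
    for cmd in south_commands(app_label, manage_py):
        subprocess.check_call(cmd)
```

From inside the virtualenv, `run_south("notes", "{project_root}/manage.py")` would run both steps with the real path substituted. Building the commands separately from running them makes it easy to print and review them first.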
 329
 330 # Assets from Third Parties
 331
 332 A number of assets have been added to the repository which come from external
 333 sources. It would be difficult to keep a complete list in this README and keep
 334 it up to date. Software which originally came from outside parties can
 335 generally be found in `{project_root}/karmaworld/assets`.
 336
 337 Additionally, all third party Python projects (downloaded and installed with
 338 pip) are listed in these files:
 339
 340 * `{project_root}/reqs/common.txt`
 341 * `{project_root}/reqs/dev.txt`
 342 * `{project_root}/reqs/prod.txt`
 343
 344 # Thanks
 345
 346 * KarmaNotes.org is a project of the FinalsClub Foundation with generous funding from the William and Flora Hewlett Foundation
 347
 348 * Also thanks to [rdegges](https://github.com/rdegges/django-skel) for the django-skel template