# Document Deposit Assistant
The *Document Deposit Assistant* (DDA) is a web application that imports large amounts of content and its metadata from a variety of data sources into a target repository.
To accomplish this aim, DDA consists of two complementary services:
* a web-based interface, where content providers (e.g. publishers, libraries, repository managers) are guided through a wizard that asks easily understandable questions about their content management infrastructure (e.g. which software they use, such as the DSpace institutional repository application).
* a service which uses the answers elicited by the wizard in order to connect to the content management infrastructure or process uploaded data dumps, harmonize metadata, and finally import it into the target repository.

## DDA and DSpace
Currently, one DDA installation supports one target DSpace 5+ repository installation. Because a DDA installation interacts with its target repository via REST, both can be deployed and restarted independently.

When a content provider has successfully used DDA to import a batch of content to the target DSpace repository, the batch lands in the target collection's *XMLWorkflow* task pool, where that collection's editors and reviewers can validate and improve each submission as usual before archiving (or rejecting) it.

# Initial setup
DDA is currently focused on working with a DSpace 5+ installation. In particular, it requires a running [DSpace REST endpoint](https://wiki.duraspace.org/display/DSDOC5x/REST+API) with [additional endpoints](https://git.gesis.org/dspace/rest-additions). DSpace must be running with [XMLWorkflow](https://wiki.duraspace.org/display/DSDOC5x/Configurable+Workflow).

## Creating a *Document Deposit Assistant* user
DDA will import documents to DSpace as a registered DSpace user. To create a new DDA user account within DSpace:

1. Log in with administrator privileges.
2. Select *Access Control* -> *People*.
3. Click *Click here to add a new E-Person*.
4. Provide a valid and unique e-mail address, enter "Document" as first name and "Deposit Assistant" as last name, and make sure "Can Log In" is selected.
5. Click *Create E-Person*.
6. Back in the *E-person management* interface, search for e-people with the string "Deposit Assistant" and select the correct *Document Deposit Assistant* e-person from the results.
7. Click *Login as E-Person* (if available) or *Reset Password* in order to give this user a password.


## Creating a *Document Deposit Assistant* collection
DDA needs to know about a DSpace *collection* into which it can import its newly processed items.

In your DSpace installation, we suggest creating a new DSpace collection exclusively for DDA imports. This allows you to wipe all DDA-supplied imports in case something goes wrong. While logged in as a DSpace administrator, click on *Browse* -> *Communities & Collections* in order to get the *community list* overview. Either create a new community or select the community which the *Document Deposit Assistant* collection should be part of, and click *create Collection*. Provide a meaningful name such as *Document Deposit Assistant* and click *Create*.
This opens the *Edit Collection* dialog. On the *Assign Roles* tab, within the *submitters* section, click *Create...*. This creates a new group which is granted submitter rights to this collection, and brings you to the membership dialog for this group. Look at the headline of this dialog. It should be of the form: `Group Editor: COLLECTION_XXX_SUBMIT (id: YYY)`. Take note of the `XXX` part, as this is the collection *ID* (not the collection *handle*) that will be required later. In this submitter group membership dialog, search for e-people with the string "Deposit Assistant", identify the correct *Document Deposit Assistant* e-person in the results, click its *Add* button, and click *Save* to finalize this step.
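
If you copy the headline from the browser, the collection ID can also be extracted mechanically. The following is a minimal sketch, assuming the headline has exactly the form shown above and that the ID is numeric. The function name `dda_collection_id` and the sample values `42` and `137` are made up for illustration; they are not part of DDA or DSpace.

```shell
# Hypothetical helper (not part of DDA or DSpace): extract the collection ID
# from a group name of the form COLLECTION_<id>_SUBMIT.
dda_collection_id() {
    # sed -n with the p flag prints only when the pattern matched,
    # so an unexpected headline yields empty output.
    printf '%s\n' "$1" | sed -n 's/.*COLLECTION_\([0-9][0-9]*\)_SUBMIT.*/\1/p'
}

# Sample headline with made-up IDs
dda_collection_id "Group Editor: COLLECTION_42_SUBMIT (id: 137)"
```

An empty result means the headline did not match the expected form, in which case the ID should be read off manually.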



# Technical overview

This application was generated using JHipster; documentation and help are available at [https://jhipster.github.io](https://jhipster.github.io).

Before you can build this project, you must install and configure the following dependencies on your machine:

1. [Node.js][]: We use Node to run a development web server and build the project.
   Depending on your system, you can install Node either from source or as a pre-packaged bundle.

DDA was generated with JHipster 2.27.1. If you want to extend the DDA source code base with additional JHipster entities, you will need exactly that version:

    npm install -g yo
    npm install -g generator-jhipster@2.27.1

We use [Grunt][] as our build system. Install the Grunt command-line tool and [Bower][] globally with:

    npm install -g grunt-cli bower

After installing Node, you should be able to run the following command to install development tools (like
[Bower][] and [BrowserSync][]). You will only need to run this command when dependencies change in package.json.

    npm install

Run the following commands in two separate terminals to create a blissful development experience where your browser
auto-refreshes when files change on your hard drive.

    mvn
    grunt

Bower is used to manage CSS and JavaScript dependencies used in this application. You can upgrade dependencies by
specifying a newer version in `bower.json`. You can also run `bower update` and `bower install` to manage dependencies.
Add the `-h` flag on any command to see how you can use it. For example, `bower update -h`.

## Staging
### Initial setup
The staging environment shall be as close as possible to the production environment. DDA uses MySQL in its staging environment. Assuming a MySQL server is running, and the `mysql` client tool exists, run the following commands in order to set up the DDA MySQL database in a state as expected by DDA's `staging` profile:
```
mysql --user=root --password --host=localhost --port=3306 --protocol=TCP --verbose --execute="create database if not exists dda character set utf8 collate utf8_general_ci;"
mysql --user=root --password --host=localhost --port=3306 --protocol=TCP --verbose --execute="create user 'dda'@'localhost' identified by 'dda';"
mysql --user=root --password --host=localhost --port=3306 --protocol=TCP --verbose --execute="grant all privileges on dda.* to 'dda'@'localhost'; flush privileges;"
```
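
The three invocations above can also be combined into a single client session. The following is a minimal sketch, not part of DDA: the function name `dda_staging_sql` is made up for illustration, and its arguments (database name, user name, password) default to the `dda` values used above. The arguments are inserted into the SQL verbatim, so only pass trusted values.

```shell
# Hypothetical helper: print the setup statements for a given database name,
# user name and password (all three default to "dda", as in the commands above).
dda_staging_sql() {
    db=${1:-dda}; user=${2:-dda}; pass=${3:-dda}
    cat <<EOF
create database if not exists ${db} character set utf8 collate utf8_general_ci;
create user '${user}'@'localhost' identified by '${pass}';
grant all privileges on ${db}.* to '${user}'@'localhost';
flush privileges;
EOF
}

# Review the statements first ...
dda_staging_sql
# ... then pipe them into one mysql session, for example:
# dda_staging_sql | mysql --user=root --password --host=localhost --port=3306 --protocol=TCP --verbose
```

Printing the statements before executing them makes it easy to review what will run against the server.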
### Debugging staging environment
The staging environment is set up in such a way that it allows connecting a remote Java debugger (see file `etc/dda-wizard-staging.conf`). You can connect to it like so:
* First, forward the remote debugger port through an SSH tunnel with `ssh -L 8002:localhost:8002 svko-dda-test.gesis.intra`.
* Then, from Eclipse, create a new remote debug configuration with host `localhost` and port `8002`. Click connect.
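
If Eclipse is not at hand, the same two steps can be done from a terminal with the JDK's command-line debugger `jdb`. This is a minimal sketch, assuming that the staging JVM accepts socket-based debugger connections on the forwarded port; the function name `dda_debug_staging` is made up for illustration.

```shell
# Hypothetical helper: forward the remote debugger port and attach jdb.
# Host and port default to the values given above.
dda_debug_staging() {
    host=${1:-svko-dda-test.gesis.intra}
    port=${2:-8002}
    # -f: go to the background after authentication; -N: run no remote command.
    ssh -f -N -L "${port}:localhost:${port}" "$host" || return 1
    # Attach the JDK command-line debugger to the forwarded local port.
    jdb -attach "localhost:${port}"
}
```

Call `dda_debug_staging` without arguments to use the defaults. The SSH tunnel keeps running in the background after `jdb` exits and has to be stopped manually.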

### Running DDA with the `staging` profile on a development machine
To build a staging version on your development machine, run `mvn package -Pstaging -DskipTests=true`. To run this staging version on your development machine, run `java -jar target/dda-wizard.war --spring.profiles.active=staging`.

# Building for production

To optimize the DDA client for production, run:

    mvn -Pprod clean package

This will concatenate and minify CSS and JavaScript files. It will also modify `index.html` so it references
these new files.

To ensure everything worked, run:

    java -jar target/*.war --spring.profiles.active=prod

Then navigate to [http://localhost:8080](http://localhost:8080) in your browser.

## Testing

Unit tests are run by [Karma][] and written with [Jasmine][]. They're located in `src/test/javascript` and can be run with:

    grunt test

## Development
### Development methodology
##### Fixing bugs and building features on dedicated branches
It is best practice to fix a bug or develop a new feature on a dedicated git branch and, once the task is finished, merge the changes back into the *master* branch.
* For the whole development group, this helps to maintain a working DDA Wizard version in the *master* branch: it will never contain a half-baked version.
* For the individual developer(s) working on the branch, it provides a known DDA Wizard git project state to work against, and changes made concurrently by others on *master* won't interfere with their work.
* The final merge into *master* shows the *set of changes* across the whole DDA Wizard git project, i.e. *what* had to be changed in order to implement the feature or bugfix.

To work with branches, follow these steps:

```
cd ~/git/dda-wizard/

git checkout master

# get the current DDA Wizard repository state into your local repository
git pull

# create a new FEATURE or BUGFIX branch and give it a meaningful name
git checkout -b FEATURE-fancy-feature

# ... make modifications on this branch FEATURE-fancy-feature
# commit these changes,
# and in case a work-in-progress at the end of the day leaves your branch in an inconsistent, nonworking state,
# then add a 'WIP' work in progress prefix for references.
git add X Y Z
git commit -m "WIP foo"

# save those changes also on the upstream branch
git push
# maybe on first branch push, git will ask you to set the upstream branch... 
# ... in that case, just copy and paste the set-upstream command as provided by git

# make some more edits, adds and commits on the local branch...

# you now think you have finished all work on this branch, git push your branch a final time ...
# DDA Wizard's Jenkins will deploy your branch to dda-wizard.svko-dda-test.gesis.intra ...
# Have all feature/bugfix stakeholders (e.g. Agathe) play with the svko-dda-test instance and give you feedback

# Assuming now that you and all others are happy with what this branch provides, merge that branch into master ...
# First you checkout your local master branch
git checkout master

# fetch and merge latest origin/master commits into your local master branch:
git pull

# now local master branch is up-to-date

# now merge local FEATURE-fancy-feature into your local master:
git merge --no-ff FEATURE-fancy-feature
# if there are no conflicts, git creates the merge commit right away.
# in case of merge conflicts, resolve the conflicts (hint: `git mergetool`) and conclude the merge with:
git commit # that's right, don't provide a commit message. Git will generate one for you.

# assuming "merge --no-ff ..." worked, push this commit to remote repository...

git push

# make a final quality assurance test on svko-dda-test, and make sure that both your new changes and all previously developed features and bugfixes work smoothly together...
```
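
The branch creation and the first push with an upstream can be wrapped in two small shell functions. This is a minimal sketch under stated assumptions: the function names `dda_branch_name` and `dda_start_branch` are made up for illustration, and the remote is assumed to be called `origin`.

```shell
# Hypothetical helper: build a branch name such as FEATURE-fancy-feature
# from a kind (FEATURE or BUGFIX) and a free-text description.
dda_branch_name() {
    case "$1" in
        FEATURE|BUGFIX) ;;
        *) echo "kind must be FEATURE or BUGFIX" >&2; return 1 ;;
    esac
    # Lowercase the description and turn every run of other characters into one dash.
    slug=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-')
    # Strip a leading and a trailing dash, if present.
    slug=${slug#-}; slug=${slug%-}
    printf '%s-%s\n' "$1" "$slug"
}

# Hypothetical helper: create the branch and push it with an upstream set,
# so that later plain "git push" calls need no extra arguments.
# Assumption: the remote is named "origin".
dda_start_branch() {
    branch=$(dda_branch_name "$1" "$2") || return 1
    git checkout -b "$branch" && git push --set-upstream origin "$branch"
}
```

For example, `dda_start_branch FEATURE "fancy feature"` creates and pushes the branch `FEATURE-fancy-feature`.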

### In-memory database
You can interact with the H2 in-memory database by visiting its web interface at [http://localhost:8080/h2-console](http://localhost:8080/h2-console). As *JDBC URL*, provide `jdbc:h2:mem:dda`. As *User Name*, provide `DDA`. Leave *Password* empty.

### Debugging
The `dev` profile activates Java debugging capability. You can connect a client debugger by pointing it to `localhost:5005`.

### Project source filesystem layout
    /       <--- development- and build- relevant files, including this README.MD, pom.xml, package.json, Gruntfile.js ... not part of the final build artifact
    |- src/
        |- main/
            |- java/      <--- dda-wizard Java source files
            |- resources/ <--- in the final build artifact, its content will land in /WEB-INF/classes/. This content won't be served out as files via HTTP.
            |- scss/      <--- Gruntfile.js configures the grunt-sass task to process SASS stylesheets in this directory
            |- webapp/    <--- in the final build artifact, its content will land in /. This content will be served out as files via HTTP!

### Build process
DDA is built with Maven. `pom.xml` configures the default `mvn` behavior to run the `spring-boot:run` goal and use the `dev` Maven profile.

#### Building with the default Maven `dev` profile
For the default Maven `dev` profile, during the `generate-resources` phase, the `yeoman-maven-plugin` runs the following commands in *this* project's root directory: `npm install && bower install --no-color && grunt sass:server --force`. Let's take a look at each of these frontend-specific build steps:

##### `npm install`
`npm install` reads file `/package.json` and downloads all (transitive) `dependencies` and `devDependencies` to directory `/node_modules`.

##### `bower install --no-color`
`bower install --no-color` reads file `/bower.json` and sees `appPath` configured to be `src/main/webapp`. Therefore, bower downloads all (transitive) `dependencies` and `devDependencies` to directory `/src/main/webapp/bower_components`.

##### `grunt sass:server --force`
`grunt sass:server --force`: grunt interprets file `Gruntfile.js`. It uses [`load-grunt-tasks`](https://github.com/sindresorhus/load-grunt-tasks) to automatically find and register all grunt tasks in `/node_modules/*` by looking for the default `grunt-*` pattern, including the `sass` task. `Gruntfile.js` configures the `sass` task to have a [target](http://gruntjs.com/api/grunt.task#grunt.task.registermultitask) `sass:server`, which configures `grunt-sass` to find *DDA*'s source Sass stylesheets at `/src/main/scss/`, to find referenced `@imports` in `/src/main/webapp/bower_components/` (using the underlying [`node-sass` option `includePaths`](https://github.com/sass/node-sass#options)), and to put the generated `.css` output files into `/src/main/webapp/assets/styles/`.

Having this configured, grunt executes this Sass generation.
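
To reproduce these frontend steps outside of Maven, for example when debugging a failing step, the same three commands can be run by hand from the project root. The following is a minimal sketch; the function name `dda_frontend_dev` and its `--dry-run` flag are made up for illustration.

```shell
# Hypothetical helper: run the same three frontend steps that the Maven dev
# profile triggers. With --dry-run the commands are only printed.
dda_frontend_dev() {
    for step in "npm install" "bower install --no-color" "grunt sass:server --force"; do
        echo "+ $step"
        if [ "$1" != "--dry-run" ]; then
            # Word splitting of $step is intended here: each entry is a
            # plain command line without quoting or special characters.
            $step || return 1
        fi
    done
}

# Print the steps without executing them
dda_frontend_dev --dry-run
```

Without `--dry-run`, the function stops at the first failing step and returns a non-zero status.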

#### Building with the Maven `staging` profile
For the staging environment, DDA is built with the Maven `staging` profile. Let's have a look at what happens when running `mvn clean package -Pstaging`.

The `yeoman-maven-plugin` runs the following commands in *this* project's root directory: `npm install && bower install --no-color && grunt test --no-color && grunt build --no-color`.

##### `grunt test --no-color`
`Gruntfile.js` registers a task `test`. It depends on the following subtasks:
* `clean:server`: this `grunt-contrib-clean` task will delete the [`.tmp` directory](http://stackoverflow.com/q/25621410/923560)
* `wiredep:test`: the `grunt-wiredep` task will update file `src/test/javascript/karma.conf.js` to include all Bower components for Karma tests.
* `ngconstant:dev`: this `grunt-ng-constant` task will create a file `/src/main/webapp/scripts/app/app.constants.js` which acts as an Angular module providing two constants: `ENV=dev` and `VERSION=${POM_VERSION}`.
* `sass:server`: see the discussion earlier in *this* documentation. This task will take all SASS stylesheets from `/src/main/scss/` (and their transitive `@import` Bower dependencies), convert them to CSS, and place these CSS files into `src/main/webapp/assets/styles/`.
* `karma`: this `grunt-karma` task will use the previously updated configuration file `src/test/javascript/karma.conf.js` to configure Karma JavaScript tests:
  * it loads the following [Karma plugins](https://karma-runner.github.io/1.0/config/plugins.html): `karma-script-launcher, karma-chrome-launcher, karma-html2js-preprocessor, karma-jasmine, karma-requirejs, karma-phantomjs-launcher, karma-coverage, karma-jenkins-reporter`.
  * it activates [Coverage](https://karma-runner.github.io/0.8/config/coverage.html).
  * it uses [Jasmine](https://jasmine.github.io/) as the testing framework
  * it configures as reporters: [`dots`, `progress`](http://stackoverflow.com/a/25601443/923560), [`jenkins` for XML JUnit format reports](https://www.npmjs.com/package/karma-jenkins-reporter), and `Publish JUnit test result report`.
  * it provides to the testing browser all of the following [`files`](http://karma-runner.github.io/1.0/config/files.html): all bower components, all AngularJS frontend files, and almost all files in `/src/test/javascript/**`.

`karma-jasmine` will start up the AngularJS application, will additionally set up all Jasmine helpers (located in `/src/test/javascript/spec/helpers/**`) and then run all Jasmine `describe(..)` tests. These tests are located in `/src/test/javascript/spec/**`.

##### `grunt build --no-color`
`Gruntfile.js` registers a task `build`. It depends on the following subtasks:
* `clean:dist`: the `grunt-contrib-clean` task will delete the [`.tmp/`](http://stackoverflow.com/q/25621410/923560) and `/src/main/webapp/dist/` directories.
* `wiredep:app`: this `grunt-wiredep` task will update `/src/main/webapp/index.html` to include Bower JavaScript and Bower CSS dependencies, and will update `/src/main/scss/main.scss` to include Bower SCSS dependencies.
* `ngconstant:prod`: this `grunt-ng-constant` task will create a file `.tmp/scripts/app/app.constants.js` which acts as an Angular module providing two constants: `ENV=prod` and `VERSION=${POM_VERSION}`.
* `useminPrepare`: this [`grunt-usemin`](https://github.com/yeoman/grunt-usemin#the-useminprepare-task) task takes file `/src/main/webapp/index.html` and examines all its `build:js` and `build:css` blocks. It will then dynamically add additional task targets to the Grunt configuration: `concat:generated`, `uglifyjs:generated`, `cssmin:generated` and `autoprefixer:generated`.
* `ngtemplates`: this [`grunt-angular-templates`](https://github.com/ericclemmons/grunt-angular-templates) task takes all jHipster- and DDA-specific HTML files and generates an HTML-minified, JavaScript-based templates file from them at location `/.tmp/templates/templates.js`.
* `sass:server`: See the discussion earlier in *this* documentation. This task will take all SASS stylesheets from `/src/main/scss/` (and their transitive `@import` Bower dependencies), convert them to CSS, and place these CSS files into `src/main/webapp/assets/styles/`.
* `imagemin`: this [`grunt-contrib-imagemin`](https://github.com/gruntjs/grunt-contrib-imagemin) task will take all JPEG images from directory `/src/main/webapp/assets/images/**`, minify them, and copy the results to `/src/main/webapp/dist/assets/images/`.
* `svgmin`: this [`grunt-svgmin`](https://github.com/sindresorhus/grunt-svgmin) task behaves identically to the aforementioned `imagemin` task, but for SVG images.
* `concat`: this [`grunt-contrib-concat`](https://github.com/gruntjs/grunt-contrib-concat) task will execute the previously generated `concat:generated` target (generated by `useminPrepare`). This target bundles all DDA-specific JavaScript files into a single temporary file `/.tmp/concat/scripts/app.js` ... and all Bower JavaScript dependencies in a single temporary file `/.tmp/concat/scripts/vendor.js`.
* `copy:fonts`: this [`grunt-contrib-copy`](https://github.com/gruntjs/grunt-contrib-copy) target copies all Bootstrap fonts to `/src/main/webapp/dist/assets/fonts/`
* `copy:dist`: this `grunt-contrib-copy` target copies from `/src/main/webapp/` all HTML files, all images, and all fonts verbatim to `/src/main/webapp/dist/`.
* `ngAnnotate`: the [grunt-ng-annotate](https://github.com/mgol/grunt-ng-annotate) task allows for expressing [AngularJS dependency annotations](https://docs.angularjs.org/guide/di#dependency-annotation) [differently](https://www.npmjs.com/package/ng-annotate).
* `cssmin`: this [`grunt-contrib-cssmin`](https://github.com/gruntjs/grunt-contrib-cssmin) task will execute the previously generated `cssmin:generated` target (generated by `useminPrepare`). `useminPrepare`'s `/src/main/webapp/index.html` analysis will have `cssmin:generated` take file `/src/main/webapp/assets/styles/main.css` (previously generated during the `sass:server` target), css-minify it, and place it at `/.tmp/cssmin/assets/styles/main.css`. Also, following the `index.html` analysis, it will take all Bower CSS dependencies, concatenate them into one bundle, minify that bundle, and place it at `/.tmp/cssmin/assets/styles/vendor.css`.
* `autoprefixer`: this [grunt-autoprefixer](https://github.com/nDmitry/grunt-autoprefixer) task will execute the previously generated `autoprefixer:generated` target (generated by `useminPrepare`). This target takes the outputs from the previous usemin-css step target `cssmin:generated` and prefixes CSS properties with vendor prefixes. As `autoprefixer` is the last step in the usemin CSS pipeline, the final usemin CSS artifacts will land in the paths as specified with `useminPrepare.options.dest + $(build:css-annotations-found-in-index.html)`: `/src/main/webapp/dist/assets/styles/main.css` and `/src/main/webapp/dist/assets/styles/vendor.css`.
* `uglify`: this [`grunt-contrib-uglify`](https://github.com/gruntjs/grunt-contrib-uglify) task will execute the previously generated `uglify(js):generated` target (generated by `useminPrepare`). This target takes the output from the previous usemin-js step target `concat:generated` and uglify-js-minifies it. As `uglify` is the last step in the usemin JS pipeline, the final usemin JS artifacts will land in the paths as specified with `useminPrepare.options.dest + $(build:js-annotations-found-in-index.html)`: `/src/main/webapp/dist/scripts/app.js` and `/src/main/webapp/dist/scripts/vendor.js`.
* `rev`: this [grunt-rev](https://github.com/sebdeckers/grunt-rev) task uses *revving* to rename JS, CSS, image, and font files in the `/src/main/webapp/dist/` directory.
* `usemin`: this [grunt-usemin](https://github.com/yeoman/grunt-usemin#the-usemin-task) task investigates all HTML files within the `/src/main/webapp/dist/` directory (previously copied there during the `copy:dist` target execution). The task will find references to unconcatenated, unrevved assets (JS, CSS, images), then replace these references with the filenames of the concatenated, revved bundles.
* `htmlmin`: this [`grunt-contrib-htmlmin`](https://github.com/gruntjs/grunt-contrib-htmlmin) task takes all `/src/main/webapp/dist/*.html` files and html-minifies them in-place.
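
After `grunt build` has finished, a quick sanity check is to confirm that the four bundles named above exist in the `dist` directory. The following is a minimal sketch; the function name `dda_check_dist` is made up for illustration, and it assumes that `grunt-rev` keeps the original file name as a suffix of the revved name (for example `<hash>.app.js`), which is why each pattern starts with a wildcard.

```shell
# Hypothetical helper: check that the CSS and JS bundles exist under the
# given dist directory (default: src/main/webapp/dist).
# Assumption: revved files end in the original file name.
dda_check_dist() {
    dist=${1:-src/main/webapp/dist}
    missing=0
    for pattern in "assets/styles/*main.css" "assets/styles/*vendor.css" \
                   "scripts/*app.js" "scripts/*vendor.js"; do
        # The unquoted $pattern lets the shell expand the wildcard; if nothing
        # matches, the pattern stays literal and the -e test below fails.
        set -- "$dist"/$pattern
        if [ ! -e "$1" ]; then
            echo "missing: $dist/$pattern" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Run `dda_check_dist` from the project root; it reports each missing bundle on standard error and returns a non-zero status if any is absent.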


[JHipster]: https://jhipster.github.io/
[Node.js]: https://nodejs.org/
[Bower]: http://bower.io/
[Grunt]: http://gruntjs.com/
[BrowserSync]: http://www.browsersync.io/
[Karma]: http://karma-runner.github.io/
[Jasmine]: http://jasmine.github.io/2.0/introduction.html
[Protractor]: https://angular.github.io/protractor/