Pre-release merge for search v0.5 (#221)
* ARXIVNG-281 moved Kinesis BaseConsumer to arxiv-base
* ARXIVNG-1177 prototype API
* ARXIVNG-1177 updating openapi and jsonschema
* ARXIVNG-1177 working on serialization
* ARXIVNG-1177 added tests using JSON schema
* ARXIVNG-1177 added auth, more tests
* ARXIVNG-1206 added support for configurable return fields; query by URI.
* ARXIVNG-1348 updated requests to >= 2.20.0
* ARXIVNG-1347 added cross list search option in advanced; ARXIVNG-1278 all-fields includes cross-list search
* ARXIVNG-1277 add styling for secondary search categories
* ARXIVNG-1349 change layout for search results to better align tags and DOI tags
* ARXIVNG-1349 rearrange header in advanced search take 1
* ARXIVNG-1349 tweak margins and layout for tablet/mobile
* ARXIVNG-1363 upgraded requests version; ARXIVNG-1362 added custom user agent
* ARXIVNG-1357 upgraded secondary classification mapping to be consistent with primary
* supporting cross-list classification in high-level filtering, per #209
* ARXIVNG-1448 support for primary, ARXIVNG-1447 secondary categories
* ARXIVNG-1223 query parameters are included in response metadata
* ARXIVNG-1349 mobile styling fix for results
erickpeirson authored and mhl10 committed Dec 20, 2018
1 parent 7f5a0eb commit 63bf5be
Showing 135 changed files with 3,781 additions and 1,460 deletions.
11 changes: 11 additions & 0 deletions DECISIONS.md
@@ -1,5 +1,7 @@
# Decision log

## Initial design decisions - v0.1-0.4

1. To get started quickly, we will start with an AWS Elasticsearch managed
cluster running in the cloud. We may wish to run our own cluster in the
future.
@@ -32,3 +34,12 @@
results, and we are only seeking feature-parity with the classic system.
When we address hit highlighting, we can show matching author names deep in
author list to provide visual feedback to the user.

## Subsequent decisions

- 2018-12-18. Removing cross-list functionality in v0.1 was a regression. Users
expect to be able to search by cross-list category just like primary
category. We decided to include cross-list/secondary category in the
all-fields search, add a cross-list field to the advanced search interface,
and include cross-list classification in shortcut routes and the advanced
interface's classification filter (with option to exclude).
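
The 2018-12-18 decision above can be illustrated with a minimal sketch, which is not part of this commit. It shows one way a classification filter could match papers by primary category, and optionally by cross-list category, using an Elasticsearch `bool` query. The field names and the `classification_filter` helper are hypothetical, and the sketch assumes both fields are plain keyword fields; the actual index mapping may differ.

```python
from typing import Any, Dict, List

# Hypothetical field names; the real index mapping may differ.
PRIMARY_FIELD = "primary_classification.category.id"
SECONDARY_FIELD = "secondary_classification.category.id"


def classification_filter(categories: List[str],
                          include_cross_list: bool = True) -> Dict[str, Any]:
    """Build an Elasticsearch bool query matching any of ``categories``."""
    # Always match on the primary category.
    should: List[Dict[str, Any]] = [{"terms": {PRIMARY_FIELD: categories}}]
    if include_cross_list:
        # Also match papers cross-listed into one of the categories.
        should.append({"terms": {SECONDARY_FIELD: categories}})
    # At least one of the clauses must match for a paper to be returned.
    return {"bool": {"should": should, "minimum_should_match": 1}}
```

Passing `include_cross_list=False` corresponds to the "option to exclude" mentioned in the decision: only the primary-category clause remains.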
6 changes: 4 additions & 2 deletions Dockerfile
@@ -3,12 +3,14 @@
# Defines the runtime for the arXiv search service, which provides the main
# UIs (and, eventually, APIs) for search.

-FROM arxiv/base:0.6.1
+FROM arxiv/base:0.12.1

WORKDIR /opt/arxiv

# Install MySQL.
RUN yum install -y which mysql mysql-devel

# Add Python application and configuration.
ADD requirements/prod.txt /opt/arxiv/requirements.txt
ADD app.py /opt/arxiv/
ADD Pipfile /opt/arxiv/
ADD Pipfile.lock /opt/arxiv/
2 changes: 1 addition & 1 deletion Dockerfile-agent
@@ -4,7 +4,7 @@
# article metadata becomes available. Subscribes to a Kinesis stream for
# notifications about new metadata.

-FROM arxiv/search:0.4
+FROM arxiv/search:0.5.1

WORKDIR /opt/arxiv

59 changes: 59 additions & 0 deletions Dockerfile-api
@@ -0,0 +1,59 @@
# arxiv/search-api
#
# Defines the runtime for the arXiv search API, which provides a metadata
# query API backed by Elasticsearch.

FROM arxiv/base:0.12.1

WORKDIR /opt/arxiv

# Install MySQL.
RUN yum install -y which mysql mysql-devel

# Add Python application and configuration.
ADD app.py /opt/arxiv/
ADD Pipfile /opt/arxiv/
ADD Pipfile.lock /opt/arxiv/
RUN pip install -U pip pipenv
RUN pipenv install

ENV PATH "/opt/arxiv:${PATH}"

ADD schema /opt/arxiv/schema
ADD mappings /opt/arxiv/mappings
ADD search /opt/arxiv/search
ADD wsgi-api.py /opt/arxiv/wsgi.py
RUN pip install uwsgi

ADD bin/start_search.sh /opt/arxiv/
RUN chmod +x /opt/arxiv/start_search.sh

ENV LC_ALL en_US.utf8
ENV LANG en_US.utf8
ENV LOGLEVEL 40
ENV FLASK_DEBUG 1
ENV FLASK_APP /opt/arxiv/app.py

ENV ELASTICSEARCH_SERVICE_HOST 127.0.0.1
ENV ELASTICSEARCH_SERVICE_PORT 9200
ENV ELASTICSEARCH_PORT_9200_PROTO http
ENV ELASTICSEARCH_INDEX arxiv
ENV ELASTICSEARCH_USER elastic
ENV ELASTICSEARCH_PASSWORD changeme
ENV METADATA_ENDPOINT https://arxiv.org/docmeta_bulk/

EXPOSE 8000

#CMD /bin/bash
ENTRYPOINT ["/opt/arxiv/start_search.sh"]
CMD ["--http-socket", ":8000", \
"-M", \
"-t 3000", \
"--manage-script-name", \
"--processes", "8", \
"--threads", "1", \
"--async", "100", \
"--ugreen", \
"--buffer-size", "65535", \
"--mount", "/metadata=wsgi.py", \
"--logformat", "%(addr) %(addr) - %(user_id)|%(session_id) [%(rtime)] [%(uagent)] \"%(method) %(uri) %(proto)\" %(status) %(size) %(micros) %(ttfb)"]
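
The `ELASTICSEARCH_*` variables set in Dockerfile-api above suggest that the application assembles its Elasticsearch connection from the environment. A minimal sketch of that idea follows; it is not part of this commit. The `elasticsearch_url` helper is hypothetical, and the fallback defaults simply mirror the `ENV` lines in the Dockerfile.

```python
import os
from typing import Mapping


def elasticsearch_url(env: Mapping[str, str] = os.environ) -> str:
    """Assemble a base URL from the ELASTICSEARCH_* environment variables."""
    # Defaults mirror the ENV lines in Dockerfile-api (an assumption about
    # how the application falls back when a variable is unset).
    proto = env.get("ELASTICSEARCH_PORT_9200_PROTO", "http")
    host = env.get("ELASTICSEARCH_SERVICE_HOST", "127.0.0.1")
    port = env.get("ELASTICSEARCH_SERVICE_PORT", "9200")
    return f"{proto}://{host}:{port}"
```

Overriding these variables at container start (for example with `docker run -e ELASTICSEARCH_SERVICE_HOST=...`) would then point the service at a different cluster without rebuilding the image.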
13 changes: 1 addition & 12 deletions Dockerfile-index
@@ -16,20 +16,9 @@
#
# See also ELASTICSEARCH_* and METADATA_ENDPOINT parameters, below.

-FROM arxiv/base
-
-# Add Python consumer and configuration.
-ADD requirements/prod.txt /opt/arxiv/requirements.txt
-ADD app.py /opt/arxiv/
-RUN pip install -U pip
-RUN pip install -r /opt/arxiv/requirements.txt
+FROM arxiv/search:0.5.1

ENV PATH "/opt/arxiv:${PATH}"

-ADD schema /opt/arxiv/schema
-ADD mappings /opt/arxiv/mappings
-ADD search /opt/arxiv/search
-ADD tests /opt/arxiv/tests
-ADD bulk_index.py /opt/arxiv/

WORKDIR /opt/arxiv/
20 changes: 10 additions & 10 deletions Pipfile
@@ -1,13 +1,11 @@
[[source]]

url = "https://pypi.python.org/simple"
verify_ssl = true
name = "pypi"


[packages]

-arxiv-base = "==0.6.1"
+arxiv-auth = "==0.2.3"
+arxiv-base = "==0.12.1"
boto = "==2.48.0"
"boto3" = "==1.6.6"
botocore = "==1.9.6"
@@ -18,8 +16,8 @@ coverage = "==4.4.2"
dataclasses = "==0.4"
docutils = "==0.14"
elasticsearch = "==6.2.0"
-elasticsearch-dsl = "==6.1.0"
-flask = "==0.12.2"
+elasticsearch-dsl = "==6.3.1"
+flask = "==1.0.2"
"flask-s3" = "==0.3.3"
idna = "==2.6"
ipaddress = "==1.0.19"
@@ -40,18 +38,20 @@ pyflakes = "==1.6.0"
pylama = "==7.4.3"
python-dateutil = "==2.6.1"
pytz = "==2017.3"
-requests = "==2.18.4"
+requests = "==2.20.0"
"s3transfer" = "==0.1.13"
snowballstemmer = "==1.2.1"
thrift = "==0.11.0"
thrift-connector = "==0.23"
typed-ast = "==1.1.0"
"urllib3" = "==1.22"
-werkzeug = "==0.13"
+werkzeug = "==0.14.1"
wtforms = "==2.1"
bleach = "*"

lxml = "*"

[dev-packages]

coveralls = "*"
sphinx = "*"
sphinxcontrib-websupport = "*"
sphinx-autodoc-typehints = "*"