Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Searching on custom fields #1190

Closed
candlerb opened this issue May 17, 2017 · 14 comments
Closed

Searching on custom fields #1190

candlerb opened this issue May 17, 2017 · 14 comments
Labels
pending closure Requires immediate attention to avoid being closed for inactivity status: needs owner This issue is tentatively accepted pending a volunteer committed to its implementation type: feature Introduction of new functionality to the application

Comments

@candlerb
Copy link
Contributor

Issue type:

Feature request

Two issues with searching over custom field values.

  1. If you have defined a custom string field on IP Address, and then use the specific search page (/ipam/ip-addresses/) you are presented with a search box for that field. However it only matches if you type the full value of the custom field; it doesn't do prefix matching.

  2. It would also be useful if global search could also search over custom fields, which could be as simple as:

select obj_type_id, obj_id from extras_customfieldvalue where serialized_value like 'foo%';

I realise that enumerated values would cause a problem, but just matching string-valued custom fields would be fine for me.

@jeremystretch jeremystretch added the type: feature Introduction of new functionality to the application label Jun 1, 2017
@jeremystretch
Copy link
Member

The first one is an easy fix.

The second one is much more involved. The current global search logic simply loops through the requested object types (which is all of them by default) and performs a separate query for each model. Each model with results is then displayed as its own table with related data as appropriate.

If you'd like to take a shot at implementing this using the Django ORM I'll keep this issue open, but otherwise I'm not sure it's worth the complexity and overhead to implement.

@JNR8
Copy link

JNR8 commented Jun 2, 2017

I was going to log this exact request.

We use custom fields on IP addresses to record DNS names associated with that IP. Mainnly for external IP addresses. The logic behind it is that you could enter a domain name and get all the IP addresses associated with that domain name and its sub domains. But this does not currently work. I had thought that because I had not set the custom field as filterable it did not show. but alas this is not the case.

If its possible to work your magic in to get cutom field value returned when searching that would be very helpful to us.

Thanks.

@candlerb
Copy link
Contributor Author

candlerb commented Jun 4, 2017

The second one is much more involved. The current global search logic simply loops through the requested object types (which is all of them by default) and performs a separate query for each model. Each model with results is then displayed as its own table with related data as appropriate.

If you'd like to take a shot at implementing this using the Django ORM I'll keep this issue open

I'm no expert in Django ORM, but it looks like there are a couple of ways to do this.

I believe the global search code is here in netbox/views.py:

            for obj_type in obj_types:

                queryset = SEARCH_TYPES[obj_type]['queryset']
                filter_cls = SEARCH_TYPES[obj_type]['filter']
                table = SEARCH_TYPES[obj_type]['table']
                url = SEARCH_TYPES[obj_type]['url']

                # Construct the results table for this object type
                filtered_queryset = filter_cls({'q': form.cleaned_data['q']}, queryset=queryset).qs
                table = table(filtered_queryset)
                table.paginate(per_page=SEARCH_MAX_RESULTS)

Options:

(1) For each of those querysets, do a union query to add the objects with given custom field value. For example, when you're looking for IPAddress objects, also do this query:

root@netbox:/opt/netbox/netbox# ./manage.py shell
...
>>> from ipam.models import IPAddress
>>> IPAddress.objects.filter(custom_field_values__serialized_value__startswith='ix-syslog')
<QuerySet [<IPAddress: 192.0.2.37>, <IPAddress: 2001:db8:0:2::37>]>

Obtaining the union of two querysets looks to be straightforward. Then you pass that to table() as now.

(2) Up-front, do a single query which gets all the objects which match the given custom field value

root@netbox:/opt/netbox/netbox# ./manage.py shell
...
>>> from extras.models import CustomFieldValue
>>> custom_results = [v.obj for v in CustomFieldValue.objects.filter(serialized_value__startswith='ix-syslog')]
>>> custom_results
[<IPAddress: 192.0.2.37>, <IPAddress: 2001:db8:0:2::37>]
>>>

In general, there could be a mix of different object classes in custom_results.

Then, when you do queries for each of the different object types, you can add in the relevant objects from custom_results

I know nothing about either the queryset or table objects, but it seems to me you want to do something like this (pseudo-code):

        table = table(filtered_queryset)
        table.append([r for r in custom_results if isinstance(r, obj_type)])

Or maybe it's possible to construct a QuerySet out of custom_results and union it into the first queryset.

lampwins pushed a commit to lampwins/netbox that referenced this issue Oct 13, 2017
@jeremystretch jeremystretch added the status: accepted This issue has been accepted for implementation label Jan 26, 2018
@arionl
Copy link

arionl commented Feb 1, 2018

Being able to search globally on custom fields would really helpful. I'm using a couple custom fields that apply both to Devices and Virtual Machines and at this point there is no way to get all results without doing two different searches.. While Devices and Virtual Machines have any disparate attributes, they are both fundamentally nodes-on-a-network and finding a way to do a "Filter" view that shows the common attributes of those two object sets (including Custom Fields) would be extremely handy..

@jdell64
Copy link

jdell64 commented Feb 3, 2018

What about creating a search object? That search object can run your search query and perform the necessary joins... so, in your case it would run the two searches and map reduce for you.

@arionl
Copy link

arionl commented Feb 12, 2018

@jdell64 not sure if you were referencing my comment above, but if so, how would I go about creating the custom search object? To recap my use case, I'm using a custom field to identify if a Device or Virtual Machine should be actively monitored by a separate monitoring platform (simple true/false). It would be very handy to filter on this attribute through a global search form rather than going to Devices -> Devices -> Search and Virtualization -> Virtual Machines -> Search. If you could point me in the right direction it would be greatly appreciated.

@denogio
Copy link

denogio commented Feb 22, 2018

Searching globally in custom fields would really be helpful IMO. We have some custom_fields for all our servers and vm and it would really be helpful to be able to search globally in these fields. Would really lighten our workflow.

@jdell64
Copy link

jdell64 commented Feb 22, 2018

@arionl sorry, that was a speculation of a possible solution that would have to be developed first.

(Edit by jstretch: This comment got posted 5 times somehow, deleted the other 4.)

@jeremystretch
Copy link
Member

Please stop asking for updates. If there's an update, it will appear here.

@jeremystretch
Copy link
Member

Blocked by #4878

@jeremystretch jeremystretch added status: blocked Another issue or external requirement is preventing implementation and removed status: accepted This issue has been accepted for implementation labels Jul 24, 2020
@jeremystretch jeremystretch added status: accepted This issue has been accepted for implementation and removed status: blocked Another issue or external requirement is preventing implementation labels Nov 18, 2020
@jeremystretch jeremystretch added this to the v2.11 milestone Nov 18, 2020
@jeremystretch
Copy link
Member

With the v2.10 release, custom field data is now stored locally on each model instance as JSON. Marking this as needs owner for anyone who would like to take on this work. Please see the Django documentation for instructions on querying JSONField data.

@jeremystretch jeremystretch added status: needs owner This issue is tentatively accepted pending a volunteer committed to its implementation and removed status: accepted This issue has been accepted for implementation labels Dec 21, 2020
@jeremystretch jeremystretch removed this from the v2.11 milestone Dec 21, 2020
@candlerb
Copy link
Contributor Author

candlerb commented Jan 5, 2021

The main problem is efficient search without doing a full table scan of every single table in the system - especially if you want to search for substrings of any custom field value.

As a first approximation, you could just do a full-text search on the entire JSON blob:

-- https://www.postgresql.org/docs/9.6/textsearch-tables.html
DROP INDEX IF EXISTS dcim_device_cfd;
CREATE INDEX dcim_device_cfd on dcim_device USING GIN
(to_tsvector('english', custom_field_data::text));

-- Doing the search
EXPLAIN
select id,name from dcim_device
where to_tsvector('english', custom_field_data::text) @@ to_tsquery('english', 'fox');

However, the EXPLAIN shows a sequential scan, and I don't know why. Did I do something wrong, or is it just that the table is too small?

A more advanced implementation would index only the values and not the keys from the JSON, and ignore integer and boolean values. I knocked up the following, and also tried using a trigram index:

-- inspired by https://www.postgresql.org/message-id/CAONrwUFOtnR909gs+7UOdQQB12+pXsGUYu5YHPtbQk5vaE9Gaw@mail.gmail.com
CREATE OR REPLACE FUNCTION jsonb_string_values(data jsonb) RETURNS text[] AS
$$
  select array_agg(value->>0)::text[] as value from jsonb_each(data) where jsonb_typeof(value) = 'string';
$$ LANGUAGE SQL IMMUTABLE;

-- Trigram index requires single string
CREATE OR REPLACE FUNCTION jsonb_string_values_concat(data jsonb) RETURNS text AS
$$
  select array_to_string(jsonb_string_values(data), '|');
$$ LANGUAGE SQL IMMUTABLE;

CREATE EXTENSION IF NOT EXISTS pg_trgm;
DROP INDEX IF EXISTS dcim_device_cfv;
CREATE INDEX dcim_device_cfv on dcim_device USING GIN
(jsonb_string_values_concat(custom_field_data) gin_trgm_ops);

-- Doing the search
EXPLAIN
select id,name from dcim_device
where jsonb_string_values_concat(custom_field_data) like '%fox%';

Again the EXPLAIN shows a full sequential scan.

EDIT: for test purposes do set enable_seqscan=off; to force use of the indexes even when Postgres would rather not.

@stale
Copy link

stale bot commented Feb 21, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. NetBox is governed by a small group of core maintainers which means not all opened issues may receive direct feedback. Please see our contributing guide.

@stale stale bot added the pending closure Requires immediate attention to avoid being closed for inactivity label Feb 21, 2021
@netbox-community netbox-community deleted a comment from fabi125 Feb 22, 2021
@stale
Copy link

stale bot commented Mar 19, 2021

This issue has been automatically closed due to lack of activity. In an effort to reduce noise, please do not comment any further. Note that the core maintainers may elect to reopen this issue at a later date if deemed necessary.

@stale stale bot closed this as completed Mar 19, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 18, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pending closure Requires immediate attention to avoid being closed for inactivity status: needs owner This issue is tentatively accepted pending a volunteer committed to its implementation type: feature Introduction of new functionality to the application
Projects
None yet
Development

No branches or pull requests

6 participants