150 changes: 150 additions & 0 deletions lucene/pom.xml
@@ -0,0 +1,150 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<groupId>com.arcadedb</groupId>
<artifactId>arcadedb-parent</artifactId>
<version>25.6.1-SNAPSHOT</version> <!-- Use the current ArcadeDB version -->
<relativePath>../pom.xml</relativePath>
</parent>

<artifactId>arcadedb-lucene</artifactId>
<packaging>jar</packaging>
<name>ArcadeDB Lucene</name>
<description>Lucene full-text search engine integration for ArcadeDB.</description>

<properties>
<!-- Define Lucene version, ensure it aligns with arcadedb-engine if not inherited -->
<!-- arcadedb-engine already defines lucene.version as 10.2.1 -->
<!-- spatial4j.version and jts-core.version are also defined in engine's POM -->
<lucene.version>10.2.1</lucene.version>
<spatial4j.version>0.8</spatial4j.version>
<jts-core.version>1.20.0</jts-core.version>
Comment on lines +19 to +25
medium

The comments here (lines 20-22) mention that lucene.version, spatial4j.version, and jts-core.version are defined in arcadedb-engine's POM.

If these versions are managed by the parent POM (e.g., in <dependencyManagement> or as inherited properties), redefining them here might be redundant or could unintentionally override the parent's intended versions.

Could you clarify if these properties are truly inherited and managed by the parent POM? If so, it might be cleaner to rely on that management (e.g., by omitting <version> tags for these dependencies if they are in <dependencyManagement>, or ensuring these properties are not re-declared if they are simple inherited properties). If they are intentionally pinned here, the comments could be updated for clarity (e.g., "These versions are pinned for this module and should align with arcadedb-engine's versions.").

</properties>

<dependencies>
<!-- ArcadeDB Core Dependency -->
<dependency>
<groupId>com.arcadedb</groupId>
<artifactId>arcadedb-engine</artifactId>
<version>${project.version}</version>
</dependency>

<!-- Apache Lucene Dependencies -->
<!-- These versions will be managed by the lucene.version property in arcadedb-engine's pom -->
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-analysis-common</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-queryparser</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-queries</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-misc</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-facet</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-memory</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-highlighter</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-codecs</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-backward-codecs</artifactId>
<version>${lucene.version}</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-spatial-extras</artifactId>
<version>${lucene.version}</version>
</dependency>

<!-- Spatial Dependencies -->
<!-- These versions will be managed by properties in arcadedb-engine's pom -->
<dependency>
<groupId>org.locationtech.spatial4j</groupId>
<artifactId>spatial4j</artifactId>
<version>${spatial4j.version}</version>
</dependency>
<dependency>
<groupId>org.locationtech.jts</groupId>
<artifactId>jts-core</artifactId>
<version>${jts-core.version}</version>
</dependency>

<!-- SLF4J for logging -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.36</version> <!-- Specify a common version -->
</dependency>
Comment on lines +108 to +112

Check notice

Code scanning / Meterian

[stability] org.slf4j:slf4j-api@1.7.36 is outdated (severity: Low)

<!-- Test Dependencies (Optional for now, can be added in Test Migration step) -->
<!--
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-engine</artifactId>
<scope>test</scope>
</dependency>
-->
Comment on lines +114 to +126
medium

The test dependencies (JUnit) are currently commented out. The PR checklist also indicates that unit test coverage is not yet confirmed. For a robust module, comprehensive tests are essential. Are there plans to uncomment these and add unit tests soon? This is important for verifying the functionality as it's developed.

</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<archive>
<manifest>
<addDefaultImplementationEntries>true</addDefaultImplementationEntries>
<addDefaultSpecificationEntries>true</addDefaultSpecificationEntries>
</manifest>
</archive>
</configuration>
</plugin>
<!--
Consider adding maven-shade-plugin here if an uber-jar is desired,
similar to how arcadedb-server and other modules might be packaged.
For now, a regular JAR is fine. Configuration can be adapted from parent POM.
-->
</plugins>
</build>
</project>
@@ -0,0 +1,34 @@
package com.arcadedb.lucene;

import com.arcadedb.database.DatabaseInternal;
import com.arcadedb.index.IndexFactoryHandler;
import com.arcadedb.index.IndexInternal;
import com.arcadedb.schema.IndexBuilder;
import com.arcadedb.schema.Type;
import com.arcadedb.lucene.index.ArcadeLuceneFullTextIndex;
import java.util.Map;

public class ArcadeLuceneIndexFactoryHandler implements IndexFactoryHandler {

@Override
public IndexInternal create(IndexBuilder builder) {
DatabaseInternal database = builder.getDatabase();
String indexName = builder.getIndexName();
boolean unique = builder.isUnique();
// Schema.INDEX_TYPE indexType = builder.getIndexType(); // This is implicitly "FULL_TEXT" for this handler
Type[] keyTypes = builder.getKeyTypes();
Map<String, String> properties = builder.getProperties();
String filePath = builder.getFilePath();


String analyzerClassName = org.apache.lucene.analysis.standard.StandardAnalyzer.class.getName();
if (properties != null && properties.containsKey("analyzer")) {
analyzerClassName = properties.get("analyzer");
}

// The actual ArcadeLuceneFullTextIndex will need to be instantiated here.
// Its constructor will need to be defined to accept these parameters.
// Adding filePath and keyTypes to the constructor call.
return new com.arcadedb.lucene.index.ArcadeLuceneFullTextIndex(database, indexName, unique, analyzerClassName, filePath, keyTypes);

Check notice on line 32 in lucene/src/main/java/com/arcadedb/lucene/ArcadeLuceneIndexFactoryHandler.java

Codacy Production / Codacy Static Code Analysis

lucene/src/main/java/com/arcadedb/lucene/ArcadeLuceneIndexFactoryHandler.java#L32

Unnecessary use of fully qualified name 'com.arcadedb.lucene.index.ArcadeLuceneFullTextIndex' due to existing import 'com.arcadedb.lucene.index.ArcadeLuceneFullTextIndex'
}
}
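The analyzer selection in the handler above (fall back to StandardAnalyzer unless an "analyzer" property is supplied) can be sketched in isolation. This is only an illustration: `AnalyzerNameResolver` and `resolve` are hypothetical names that do not exist in the PR. Unlike the handler's `containsKey` check, the sketch also treats a null or blank value as absent, which is an assumption about the desired behavior rather than something the PR states.

```java
import java.util.Map;

// Hypothetical helper, not part of the PR: isolates the analyzer-name lookup.
final class AnalyzerNameResolver {
  // Same value the handler obtains from StandardAnalyzer.class.getName().
  static final String DEFAULT_ANALYZER =
      "org.apache.lucene.analysis.standard.StandardAnalyzer";

  private AnalyzerNameResolver() {}

  // Returns the "analyzer" property when present and non-blank, else the default.
  static String resolve(Map<String, String> properties) {
    if (properties == null) {
      return DEFAULT_ANALYZER;
    }
    String configured = properties.get("analyzer");
    // Assumption: a null or blank value should not override the default.
    return (configured == null || configured.isBlank()) ? DEFAULT_ANALYZER : configured;
  }

  public static void main(String[] args) {
    System.out.println(resolve(null));
    System.out.println(
        resolve(Map.of("analyzer", "org.apache.lucene.analysis.core.KeywordAnalyzer")));
  }
}
```

Keeping the lookup in one small method would let it be unit tested without a database or a Lucene dependency on the test classpath.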
@@ -0,0 +1,46 @@
/*
* Copyright 2010-2016 OrientDB LTD (http://orientdb.com)
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package com.arcadedb.lucene;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// This class might serve as the main plugin class listed in plugin.json for initialization purposes,
// or handle lifecycle events if ArcadeDB's plugin API expects a specific class for that.
// For now, it's minimal.
public class ArcadeLuceneLifecycleManager {
private static final Logger logger = LoggerFactory.getLogger(ArcadeLuceneLifecycleManager.class);

// This constant might be better placed in ArcadeLuceneIndexFactoryHandler or a shared constants class.
public static final String LUCENE_ALGORITHM = "LUCENE";

public ArcadeLuceneLifecycleManager() {
this(false);
}

public ArcadeLuceneLifecycleManager(boolean manual) {
if (!manual) {
logger.info("ArcadeLuceneLifecycleManager initialized (manual: {}).", manual);
// Further initialization or listener registration logic specific to ArcadeDB's plugin system
// would go here if this class is the entry point.
}
}

// Any necessary lifecycle methods (e.g., from a specific ArcadeDB plugin interface) would be here.
// For now, assuming it does not need to implement DatabaseListener directly.
// Drop logic for indexes of this type should be handled by the Index.drop() method.
}
@@ -0,0 +1,141 @@
package com.arcadedb.lucene.analyzer;

import com.arcadedb.common.exception.OException;
import com.arcadedb.common.log.OLogManager;
import com.arcadedb.common.log.OLogger;
import com.arcadedb.database.index.OIndexDefinition;
import com.arcadedb.database.index.OIndexException;
import com.arcadedb.database.metadata.schema.OType;
import com.arcadedb.database.record.impl.ODocument;
import java.lang.reflect.Constructor;
import java.util.Collection;
import java.util.Locale;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

/** Created by frank on 30/10/2015. */
public class OLuceneAnalyzerFactory {
private static final OLogger logger = OLogManager.instance().logger(OLuceneAnalyzerFactory.class);

public Analyzer createAnalyzer(
final OIndexDefinition index, final AnalyzerKind kind, final ODocument metadata) {
if (index == null) {
throw new IllegalArgumentException("Index must not be null");
}
if (kind == null) {
throw new IllegalArgumentException("Analyzer kind must not be null");
}
if (metadata == null) {
throw new IllegalArgumentException("Metadata must not be null");
}
final String defaultAnalyzerFQN = metadata.field("default");
final String prefix = index.getClassName() + ".";

final OLucenePerFieldAnalyzerWrapper analyzer =
getLucenePerFieldPresetAnalyzerWrapperForAllFields(defaultAnalyzerFQN);
setDefaultAnalyzerForRequestedKind(index, kind, metadata, prefix, analyzer);
setSpecializedAnalyzersForEachField(index, kind, metadata, prefix, analyzer);
return analyzer;
}

private OLucenePerFieldAnalyzerWrapper getLucenePerFieldPresetAnalyzerWrapperForAllFields(
final String defaultAnalyzerFQN) {
if (defaultAnalyzerFQN == null) {
return new OLucenePerFieldAnalyzerWrapper(new StandardAnalyzer());
} else {
return new OLucenePerFieldAnalyzerWrapper(buildAnalyzer(defaultAnalyzerFQN));
}
}

private void setDefaultAnalyzerForRequestedKind(
final OIndexDefinition index,
final AnalyzerKind kind,
final ODocument metadata,
final String prefix,
final OLucenePerFieldAnalyzerWrapper analyzer) {
final String specializedAnalyzerFQN = metadata.field(kind.toString());
if (specializedAnalyzerFQN != null) {
for (final String field : index.getFields()) {
analyzer.add(field, buildAnalyzer(specializedAnalyzerFQN));
analyzer.add(prefix + field, buildAnalyzer(specializedAnalyzerFQN));
}
}
}

private void setSpecializedAnalyzersForEachField(
final OIndexDefinition index,
final AnalyzerKind kind,
final ODocument metadata,
final String prefix,
final OLucenePerFieldAnalyzerWrapper analyzer) {
for (final String field : index.getFields()) {
final String analyzerName = field + "_" + kind.toString();
final String analyzerStopwords = analyzerName + "_stopwords";

if (metadata.containsField(analyzerName) && metadata.containsField(analyzerStopwords)) {
final Collection<String> stopWords = metadata.field(analyzerStopwords, OType.EMBEDDEDLIST);
analyzer.add(field, buildAnalyzer(metadata.field(analyzerName), stopWords));
analyzer.add(prefix + field, buildAnalyzer(metadata.field(analyzerName), stopWords));
} else if (metadata.containsField(analyzerName)) {
analyzer.add(field, buildAnalyzer(metadata.field(analyzerName)));
analyzer.add(prefix + field, buildAnalyzer(metadata.field(analyzerName)));
}
}
}

private Analyzer buildAnalyzer(final String analyzerFQN) {
try {
final Class classAnalyzer = Class.forName(analyzerFQN);
final Constructor constructor = classAnalyzer.getConstructor();
return (Analyzer) constructor.newInstance();
} catch (final ClassNotFoundException e) {
throw OException.wrapException(
new OIndexException("Analyzer: " + analyzerFQN + " not found"), e);
} catch (final NoSuchMethodException e) {
Class classAnalyzer = null;
try {
classAnalyzer = Class.forName(analyzerFQN);
return (Analyzer) classAnalyzer.newInstance();
} catch (Exception e1) {
logger.error("Exception is suppressed, original exception is ", e);
//noinspection ThrowInsideCatchBlockWhichIgnoresCaughtException
throw OException.wrapException(
new OIndexException("Couldn't instantiate analyzer: public constructor not found"),
e1);
}
Comment on lines +95 to +106
medium

The exception handling and reflection logic in this catch (final NoSuchMethodException e) block seems a bit complex and potentially problematic:

  1. It logs the original NoSuchMethodException e with a message "Exception is suppressed, original exception is".
  2. It then attempts to instantiate the analyzer again using classAnalyzer.newInstance(), which is deprecated and can have issues with constructors throwing checked exceptions.
  3. If this second attempt fails, it catches Exception e1 and throws a new OIndexException wrapping e1, effectively losing the context of the original NoSuchMethodException e in the thrown exception chain (though e was logged).

Could this be simplified? For instance, if the goal is to try a public no-arg constructor, perhaps classAnalyzer.getDeclaredConstructor().newInstance() would be more direct and modern after classAnalyzer.getConstructor() fails. Also, ensuring the correct exception is wrapped and propagated might improve debuggability.
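One possible shape for the simplification asked about above, as a minimal sketch rather than the PR's actual fix. It is deliberately free of Lucene and ArcadeDB types so it compiles on its own: `ReflectiveFactory` and `instantiate` are hypothetical names, and `IllegalStateException` stands in for whatever index exception the module ends up using. The sketch uses `getConstructor()`, which only finds public constructors; `getDeclaredConstructor()` would also find non-public ones, but those would then need `setAccessible(true)` before `newInstance()` succeeds.

```java
import java.lang.reflect.InvocationTargetException;
import java.util.List;

// Hypothetical helper, not part of the PR: one reflective path, one exception chain.
final class ReflectiveFactory {
  private ReflectiveFactory() {}

  // Instantiates className through its public no-arg constructor and
  // checks that the result is an instance of expectedType.
  static <T> T instantiate(String className, Class<T> expectedType) {
    try {
      Class<?> clazz = Class.forName(className);
      Object instance = clazz.getConstructor().newInstance();
      return expectedType.cast(instance);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException("Class not found: " + className, e);
    } catch (NoSuchMethodException e) {
      // The original cause stays in the chain instead of only being logged.
      throw new IllegalStateException("No public no-arg constructor: " + className, e);
    } catch (InvocationTargetException e) {
      // Unwrap so the constructor's own failure is the direct cause.
      throw new IllegalStateException("Constructor failed: " + className, e.getCause());
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalStateException("Cannot instantiate: " + className, e);
    }
  }

  public static void main(String[] args) {
    List<?> list = instantiate("java.util.ArrayList", List.class);
    System.out.println(list.getClass().getName());
  }
}
```

With a single attempt there is no second `Class.forName` call and no deprecated `Class.newInstance()`, and every failure mode surfaces with its own cause attached.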

} catch (Exception e) {
logger.error(
"Error on getting analyzer for Lucene index (continuing with StandardAnalyzer)", e);
return new StandardAnalyzer();
}
}

private Analyzer buildAnalyzer(final String analyzerFQN, final Collection<String> stopwords) {
try {
final Class<?> classAnalyzer = Class.forName(analyzerFQN);
final Constructor<?> constructor = classAnalyzer.getDeclaredConstructor(CharArraySet.class);
return (Analyzer) constructor.newInstance(new CharArraySet(stopwords, true));
} catch (final ClassNotFoundException e) {
throw OException.wrapException(
new OIndexException("Analyzer: " + analyzerFQN + " not found"), e);
} catch (final NoSuchMethodException e) {
throw OException.wrapException(
new OIndexException("Couldn't instantiate analyzer: public constructor not found"), e);
} catch (final Exception e) {
logger.error(
"Error on getting analyzer for Lucene index (continuing with StandardAnalyzer)", e);
return new StandardAnalyzer();
}
}

public enum AnalyzerKind {
INDEX,
QUERY;

@Override
public String toString() {
return name().toLowerCase(Locale.ENGLISH);
}
}
}
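The factory above reads its configuration from metadata keys built by string concatenation: "default" for the fallback analyzer, the lowercase kind name ("index" or "query") for a kind-wide analyzer, and `field + "_" + kind` plus an optional `_stopwords` suffix for per-field overrides. The sketch below only illustrates that naming convention. `AnalyzerMetadataKeys` and its methods are hypothetical, and the enum is a local copy so the example compiles without the module.

```java
import java.util.Locale;

// Hypothetical helper, not part of the PR: shows how the metadata keys are derived.
final class AnalyzerMetadataKeys {
  // Local copy of OLuceneAnalyzerFactory.AnalyzerKind so the sketch stands alone.
  enum AnalyzerKind {
    INDEX,
    QUERY;

    @Override
    public String toString() {
      return name().toLowerCase(Locale.ENGLISH);
    }
  }

  private AnalyzerMetadataKeys() {}

  // Key holding the analyzer class name for one field and kind.
  static String analyzerKey(String field, AnalyzerKind kind) {
    return field + "_" + kind;
  }

  // Key holding the stop-word list for one field and kind.
  static String stopwordsKey(String field, AnalyzerKind kind) {
    return analyzerKey(field, kind) + "_stopwords";
  }

  public static void main(String[] args) {
    System.out.println(analyzerKey("title", AnalyzerKind.INDEX));
    System.out.println(stopwordsKey("title", AnalyzerKind.QUERY));
  }
}
```

Using `Locale.ENGLISH` in `toString()` matters here: with the default locale, a Turkish environment would lowercase "INDEX" with a dotless i and the keys would no longer match.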