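Miscellaneous Elasticsearch Operation Support

This chapter covers additional support for Elasticsearch operations that cannot be accessed directly via the repository interface. It is recommended to add those operations as custom implementations, as described in Custom Repository Implementations.

Index settings

When creating Elasticsearch indices with Spring Data Elasticsearch, different index settings can be defined by using the @Setting annotation. The following arguments are available:

useServerConfiguration does not send any settings parameters, so the Elasticsearch server configuration determines them.
settingPath refers to a JSON file defining the settings; the file must be resolvable on the classpath.
shards the number of shards to use, defaults to 1
replicas the number of replicas, defaults to 1
refreshInterval, defaults to "1s"
indexStoreType, defaults to "fs"

It is also possible to define index sorting (check the Elasticsearch documentation on index sorting for the possible field types and values):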

@Document(indexName = "entities")
@Setting(
  sortFields = { "secondField", "firstField" },                                  (1)
  sortModes = { Setting.SortMode.max, Setting.SortMode.min },                    (2)
  sortOrders = { Setting.SortOrder.desc, Setting.SortOrder.asc },
  sortMissingValues = { Setting.SortMissing._last, Setting.SortMissing._first })
class Entity {
    @Nullable
    @Id private String id;

    @Nullable
    @Field(name = "first_field", type = FieldType.Keyword)
    private String firstField;

    @Nullable @Field(name = "second_field", type = FieldType.Keyword)
    private String secondField;

    // getter and setter...
}

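1	When defining sort fields, use the name of the Java property (firstField), not the name that might be defined for Elasticsearch (first_field).
2	`sortModes`, `sortOrders` and `sortMissingValues` are optional, but if they are set, the number of entries must match the number of `sortFields` elements.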
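
The constraint in callout 2 can be checked before an index is created. The following is a minimal sketch in plain Java and not part of Spring Data Elasticsearch: the class `SortSettingsCheck` and its methods are hypothetical names introduced only for illustration, and plain strings stand in for the `Setting.SortMode`, `Setting.SortOrder` and `Setting.SortMissing` values.

```java
import java.util.List;

// Hypothetical helper, not part of Spring Data Elasticsearch.
final class SortSettingsCheck {

    private SortSettingsCheck() {}

    // An optional list is consistent if it is empty (not set)
    // or has exactly one entry per sort field.
    static boolean matches(List<String> sortFields, List<String> optional) {
        return optional.isEmpty() || optional.size() == sortFields.size();
    }

    static boolean isConsistent(List<String> sortFields, List<String> sortModes,
            List<String> sortOrders, List<String> sortMissingValues) {
        return matches(sortFields, sortModes)
                && matches(sortFields, sortOrders)
                && matches(sortFields, sortMissingValues);
    }

    public static void main(String[] args) {
        List<String> fields = List.of("secondField", "firstField");
        // two modes and two orders for two fields, missing values not set: prints true
        System.out.println(isConsistent(fields, List.of("max", "min"), List.of("desc", "asc"), List.of()));
        // only one mode for two fields: prints false
        System.out.println(isConsistent(fields, List.of("max"), List.of(), List.of()));
    }
}
```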

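Index mapping

When Spring Data Elasticsearch creates the index mapping with the IndexOperations.createMapping() methods, it uses the annotations described in Mapping Annotation Overview, especially the @Field annotation. In addition, it is possible to add the @Mapping annotation to a class. This annotation has the following properties:

mappingPath a classpath resource in JSON format; if this is not empty, it is used as the mapping and no other mapping processing is done.
enabled when set to false, this flag is written to the mapping and no further processing is done.
dateDetection and numericDetection set the corresponding properties in the mapping when not set to DEFAULT.
dynamicDateFormats when this String array is not empty, it defines the date formats used for automatic date detection.
runtimeFieldsPath a classpath resource in JSON format containing the definition of runtime fields, which is written to the index mappings, for example: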

{
  "day_of_week": {
    "type": "keyword",
    "script": {
      "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
    }
  }
}

Filter Builder

Using the Filter Builder can improve query speed.

private ElasticsearchOperations operations;

IndexCoordinates index = IndexCoordinates.of("sample-index");

Query query = NativeQuery.builder()
	.withQuery(q -> q
		.matchAll(ma -> ma))
	.withFilter( q -> q
		.bool(b -> b
			.must(m -> m
				.term(t -> t
					.field("id")
					.value(documentId))
			)))
	.build();

SearchHits<SampleEntity> sampleEntities = operations.search(query, SampleEntity.class, index);

Using Scroll for Big Result Sets

Elasticsearch has a scroll API for getting big result sets in chunks. Spring Data Elasticsearch uses it internally to provide the implementation of the <T> SearchHitsIterator<T> SearchOperations.searchForStream(Query query, Class<T> clazz, IndexCoordinates index) method.

IndexCoordinates index = IndexCoordinates.of("sample-index");

Query searchQuery = NativeQuery.builder()
    .withQuery(q -> q
        .matchAll(ma -> ma))
    .withFields("message")
    .withPageable(PageRequest.of(0, 10))
    .build();

SearchHitsIterator<SampleEntity> stream = elasticsearchOperations.searchForStream(searchQuery, SampleEntity.class,
index);

List<SampleEntity> sampleEntities = new ArrayList<>();
while (stream.hasNext()) {
  // next() returns a SearchHit<SampleEntity>; getContent() unwraps the entity
  sampleEntities.add(stream.next().getContent());
}

stream.close();
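
Because a scroll keeps resources reserved until it is closed, a try-with-resources block is one way to make sure `close()` is also called when processing fails midway. The sketch below shows the pattern in plain Java. `ClosingIterator` and `ScrollDrain` are hypothetical stand-ins and not Spring Data types; the assumption is that `SearchHitsIterator` can be used in the same way because it is closeable.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a closeable search iterator; not a Spring Data type.
final class ClosingIterator<T> implements Iterator<T>, AutoCloseable {

    private final Iterator<T> delegate;
    private boolean closed;

    ClosingIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return !closed && delegate.hasNext();
    }

    @Override
    public T next() {
        return delegate.next();
    }

    @Override
    public void close() {
        closed = true; // a real implementation would release the server-side scroll here
    }

    boolean isClosed() {
        return closed;
    }
}

final class ScrollDrain {

    private ScrollDrain() {}

    // Collects all remaining elements and closes the iterator even if iteration throws.
    static <T> List<T> drain(ClosingIterator<T> iterator) {
        List<T> results = new ArrayList<>();
        try (ClosingIterator<T> it = iterator) {
            while (it.hasNext()) {
                results.add(it.next());
            }
        }
        return results;
    }
}
```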

The SearchOperations API has no methods to access the scroll id. If you need access to it, you can use the following methods of AbstractElasticsearchTemplate (this is the base implementation of the different ElasticsearchOperations implementations):

@Autowired ElasticsearchOperations operations;

AbstractElasticsearchTemplate template = (AbstractElasticsearchTemplate)operations;

IndexCoordinates index = IndexCoordinates.of("sample-index");

Query query = NativeQuery.builder()
    .withQuery(q -> q
        .matchAll(ma -> ma))
    .withFields("message")
    .withPageable(PageRequest.of(0, 10))
    .build();

SearchScrollHits<SampleEntity> scroll = template.searchScrollStart(1000, query, SampleEntity.class, index);

String scrollId = scroll.getScrollId();
List<SampleEntity> sampleEntities = new ArrayList<>();
while (scroll.hasSearchHits()) {
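  // getSearchHits() returns SearchHit<SampleEntity> wrappers; unwrap each entity
  scroll.getSearchHits().forEach(hit -> sampleEntities.add(hit.getContent()));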
  scrollId = scroll.getScrollId();
  scroll = template.searchScrollContinue(scrollId, 1000, SampleEntity.class);
}
template.searchScrollClear(scrollId);

To use the scroll API with repository methods, the return type must be defined as Stream in the Elasticsearch repository. The implementation of the method then uses the scroll methods of the ElasticsearchTemplate.

interface SampleEntityRepository extends Repository<SampleEntity, String> {

    Stream<SampleEntity> findBy();

}

Sort options

In addition to the default sort options described in Paging and Sorting, Spring Data Elasticsearch provides the class org.springframework.data.elasticsearch.core.query.Order, which derives from org.springframework.data.domain.Sort.Order. It offers additional parameters that can be sent to Elasticsearch when specifying the sorting of the result (see www.elastic.co/guide/en/elasticsearch/reference/7.15/sort-search-results.html).

There is also the org.springframework.data.elasticsearch.core.query.GeoDistanceOrder class, which can be used to have the result of a search operation ordered by geographical distance.

If the class to be retrieved has a GeoPoint property named location, the following Sort orders the results by distance to the given point:

Sort.by(new GeoDistanceOrder("location", new GeoPoint(48.137154, 11.5761247)))

Runtime Fields

Since version 7.12, Elasticsearch offers runtime fields (www.elastic.co/guide/en/elasticsearch/reference/7.12/runtime.html). Spring Data Elasticsearch supports them in two ways.

Runtime field definitions in the index mappings

The first way to define runtime fields is to add the definitions to the index mappings (see www.elastic.co/guide/en/elasticsearch/reference/7.12/runtime-mapping-fields.html). To use this approach in Spring Data Elasticsearch, the user must provide a JSON file containing the corresponding definition, for example:

Example 1. runtime-fields.json

{
  "day_of_week": {
    "type": "keyword",
    "script": {
      "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
    }
  }
}

The path to this JSON file, which must be present on the classpath, must then be set in the @Mapping annotation of the entity:

@Document(indexName = "runtime-fields")
@Mapping(runtimeFieldsPath = "/runtime-fields.json")
public class RuntimeFieldEntity {
	// properties, getter, setter,...
}

Runtime field definitions set on a Query

The second way to define runtime fields is to add the definitions to a search query (see www.elastic.co/guide/en/elasticsearch/reference/7.12/runtime-search-request.html). The following code example shows how to do this with Spring Data Elasticsearch.

The entity used is a simple object with a price property:

@Document(indexName = "some_index_name")
public class SomethingToBuy {

	private @Id @Nullable String id;
	@Nullable @Field(type = FieldType.Text) private String description;
	@Nullable @Field(type = FieldType.Double) private Double price;

	// getter and setter
}

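The following query uses a runtime field that defines a priceWithTax value by adding 19% to the price, and uses this value in the search query to find all entities whose priceWithTax is greater than or equal to a given value: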

RuntimeField runtimeField = new RuntimeField("priceWithTax", "double", "emit(doc['price'].value * 1.19)");
Query query = new CriteriaQuery(new Criteria("priceWithTax").greaterThanEqual(16.5));
query.addRuntimeField(runtimeField);

SearchHits<SomethingToBuy> searchHits = operations.search(query, SomethingToBuy.class);

This works with every implementation of the Query interface.
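
To make the arithmetic of the runtime field concrete, the following plain-Java sketch mirrors the script `emit(doc['price'].value * 1.19)` and the `greaterThanEqual(16.5)` criterion on the client side. The class `PriceWithTax` and its methods are hypothetical and only illustrate which stored prices would match; in the actual query the computation runs inside Elasticsearch.

```java
// Hypothetical illustration; the real computation happens in the Elasticsearch runtime field script.
final class PriceWithTax {

    private static final double TAX_FACTOR = 1.19; // price plus 19% tax, as in the script

    private PriceWithTax() {}

    static double priceWithTax(double price) {
        return price * TAX_FACTOR;
    }

    // Mirrors new Criteria("priceWithTax").greaterThanEqual(threshold).
    static boolean matches(double price, double threshold) {
        return priceWithTax(price) >= threshold;
    }

    public static void main(String[] args) {
        System.out.println(matches(14.0, 16.5)); // 14.0 * 1.19 is about 16.66, so this prints true
        System.out.println(matches(13.0, 16.5)); // 13.0 * 1.19 is about 15.47, so this prints false
    }
}
```

In other words, with a threshold of 16.5 only documents whose stored price is at least roughly 13.87 (16.5 divided by 1.19) are returned.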

Point In Time (PIT) API

ElasticsearchOperations supports the point in time API of Elasticsearch (see www.elastic.co/guide/en/elasticsearch/reference/8.3/point-in-time-api.html). The following code snippet shows how to use this feature with a fictional Person class:

ElasticsearchOperations operations; // autowired
Duration tenSeconds = Duration.ofSeconds(10);

String pit = operations.openPointInTime(IndexCoordinates.of("person"), tenSeconds); (1)

// create query for the pit
Query query1 = new CriteriaQueryBuilder(Criteria.where("lastName").is("Smith"))
    .withPointInTime(new Query.PointInTime(pit, tenSeconds))                        (2)
    .build();
SearchHits<Person> searchHits1 = operations.search(query1, Person.class);
// do something with the data

// create 2nd query for the pit, use the id returned in the previous result
Query query2 = new CriteriaQueryBuilder(Criteria.where("lastName").is("Miller"))
    .withPointInTime(
        new Query.PointInTime(searchHits1.getPointInTimeId(), tenSeconds))          (3)
    .build();
SearchHits<Person> searchHits2 = operations.search(query2, Person.class);
// do something with the data

operations.closePointInTime(searchHits2.getPointInTimeId());                        (4)

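1	Create a point in time for an index (multiple names can be given) with a keep-alive duration and retrieve its id.
2	Pass that id into the query to search, together with the next keep-alive value.
3	For the next query, use the id returned from the previous search.
4	When done, close the point in time using the last returned id.

Search Template support

Use of the search template API is supported. To use it, a stored script first needs to be created. The ElasticsearchOperations interface extends ScriptOperations, which provides the necessary functions. The example used here assumes that there is a Person entity with a property named firstName. A search template script can be saved like this: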

import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.script.Script;

operations.putScript(                            (1)
  Script.builder()
    .withId("person-firstname")                  (2)
    .withLanguage("mustache")                    (3)
    .withSource("""                              (4)
      {
        "query": {
          "bool": {
            "must": [
              {
                "match": {
                  "firstName": "{{firstName}}"   (5)
                }
              }
            ]
          }
        },
        "from": "{{from}}",                      (6)
        "size": "{{size}}"                       (7)
      }
      """)
    .build()
);

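1	Use the `putScript()` method to store a search template script.
2	The name / id of the script.
3	Scripts that are used in search templates must be in the Mustache language.
4	The script source.
5	The search parameter in the script.
6	The paging request offset.
7	The paging request size.

To use a search template in a search query, Spring Data Elasticsearch provides the SearchTemplateQuery, an implementation of the org.springframework.data.elasticsearch.core.query.Query interface.

Although SearchTemplateQuery is an implementation of the Query interface, not all of the functionality provided by the base class is available for a SearchTemplateQuery, such as setting a Pageable or a Sort. Values for this functionality must be added to the stored script, as the repository example below does for the paging parameters. If these values are set on the Query object, they are ignored.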
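
To illustrate why paging has to travel as script parameters, the following plain-Java sketch performs a simplified substitution of the `{{from}}` and `{{size}}` placeholders. It is not the Mustache engine used by Elasticsearch: `TemplateSketch`, `render` and `offset` are hypothetical names, and the offset is assumed to be the zero-based page number times the page size, as `Pageable.getOffset()` returns for a `PageRequest`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for Mustache rendering; for illustration only.
final class TemplateSketch {

    private TemplateSketch() {}

    // Offset of the first hit of a zero-based page.
    static long offset(int pageNumber, int pageSize) {
        return (long) pageNumber * pageSize;
    }

    // Replaces each {{name}} placeholder with the string form of its parameter value.
    static String render(String template, Map<String, Object> params) {
        String result = template;
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            result = result.replace("{{" + entry.getKey() + "}}", String.valueOf(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("from", offset(2, 10)); // third page with 10 hits per page
        params.put("size", 10);
        // prints: "from": "20", "size": "10"
        System.out.println(render("\"from\": \"{{from}}\", \"size\": \"{{size}}\"", params));
    }
}
```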

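As an example of how to integrate this into a repository call, the following code adds a call using a search template query to a custom repository implementation (see Custom Repository Implementations).

First, define the custom repository fragment interface: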

interface PersonCustomRepository {
	SearchPage<Person> findByFirstNameWithSearchTemplate(String firstName, Pageable pageable);
}

The implementation of this repository fragment looks like this:

public class PersonCustomRepositoryImpl implements PersonCustomRepository {

  private final ElasticsearchOperations operations;

  public PersonCustomRepositoryImpl(ElasticsearchOperations operations) {
    this.operations = operations;
  }

  @Override
  public SearchPage<Person> findByFirstNameWithSearchTemplate(String firstName, Pageable pageable) {

    var query = SearchTemplateQuery.builder()                               (1)
      .withId("person-firstname")                                           (2)
      .withParams(
        Map.of(                                                             (3)
          "firstName", firstName,
          "from", pageable.getOffset(),
          "size", pageable.getPageSize()
          )
      )
      .build();

    SearchHits<Person> searchHits = operations.search(query, Person.class); (4)

    return SearchHitSupport.searchPageFor(searchHits, pageable);
  }
}

1	Create a `SearchTemplateQuery`.
2	Provide the id of the search template.
3	The parameters are passed in a `Map<String,Object>`.
4	Run the search in the same way as with the other query types.

Nested sort

Spring Data Elasticsearch supports sorting within nested objects (www.elastic.co/guide/en/elasticsearch/reference/8.9/sort-search-results.html#nested-sorting).

The following example, taken from the org.springframework.data.elasticsearch.core.query.sort.NestedSortIntegrationTests class, shows how to define a nested sort:

var filter = StringQuery.builder("""
	{ "term": {"movies.actors.sex": "m"} }
	""").build();
var order = new org.springframework.data.elasticsearch.core.query.Order(Sort.Direction.DESC,
	"movies.actors.yearOfBirth")
	.withNested(
		Nested.builder("movies")
			.withNested(
				Nested.builder("movies.actors")
					.withFilter(filter)
					.build())
			.build());

var query = Query.findAll().addSort(Sort.by(order));

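About the filter query: a CriteriaQuery cannot be used here, because this query would be converted into an Elasticsearch nested query, which does not work in the filter context. Only a StringQuery or a NativeQuery can be used here. When using one of these, like the term query above, the Elasticsearch field names must be used; take care when these are redefined with the @Field(name="…") definition.

For the definition of the order path and the nested paths, the Java entity property names must be used.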