Rgw metadata search » History » Version 1

Yehuda Sadeh, 10/04/2016 06:36 PM

h1. Rgw metadata search

I have recently been working on adding metadata search to rgw. It's
not in yet, nor is it completely ready, but I think it's at a point
where it would be great to get some feedback. This feature is built on
top of another feature that I talked about a few months ago on CDM:
"sync modules" (formerly known as "sync plugins"). The current code
can be found in the following PR:

https://github.com/ceph/ceph/pull/10731

The "sync modules" feature that metadata search (via elasticsearch)
depends on provides the framework for forwarding data (and metadata)
to external tiers. It extends the current multi-zone feature such that
the current sync process uses the default sync module. A sync module
is a set of callbacks that are called for each change that happens in
data (and potentially in metadata, e.g., bucket creation, new user,
etc.; note that a change to an object's metadata is regarded as a
change in data) in a single zonegroup. The rgw multi-zone system is
eventually consistent, so changes are not applied synchronously.
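
To make the "set of callbacks" idea concrete, here is a minimal Python sketch. It is not the rgw implementation (which is C++); the names @Change@, @SyncModule@, @RecordingModule@ and @process_pending@ are all hypothetical. It only illustrates that changes are logged first and handed to the module's callbacks later, which is what makes the system eventually consistent.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Change:
    """One entry in a hypothetical change log for the zonegroup."""
    kind: str      # "data" or "metadata"
    op: str        # e.g. "create", "delete"
    bucket: str
    key: str = ""  # object name; empty for bucket-level metadata changes


class SyncModule(Protocol):
    """Callbacks a sync module would provide (hypothetical names)."""
    def on_data_change(self, change: Change) -> None: ...
    def on_metadata_change(self, change: Change) -> None: ...


@dataclass
class RecordingModule:
    """Toy module that only remembers which callbacks fired."""
    seen: list = field(default_factory=list)

    def on_data_change(self, change: Change) -> None:
        self.seen.append(("data", change.op, change.bucket, change.key))

    def on_metadata_change(self, change: Change) -> None:
        self.seen.append(("metadata", change.op, change.bucket, change.key))


def process_pending(log: list, module: SyncModule) -> int:
    """Drain the log in order and invoke the matching callback per change.

    The changes were logged earlier and are only applied now, i.e. not
    synchronously with the original operation.
    """
    handled = 0
    while log:
        change = log.pop(0)
        if change.kind == "data":
            module.on_data_change(change)
        else:
            module.on_metadata_change(change)
        handled += 1
    return handled
```

A caller would append @Change@ entries as operations happen and run @process_pending@ from a background task.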

A sync module is tied to a zone. Each zone has a module that is
responsible for handling the cross-zone data synchronization. A sync
module defines whether the zone can export data (e.g., a regular rgw
data zone), or can only digest data that was modified on another zone
(e.g., log zone, metadata search, etc.).

The zone definition within the zonegroup configuration has a new
'tier_type' field. This parameter controls which sync module will be
used for handling the cross-zone data sync. See here:

<pre>
$ mrun c1 radosgw-admin zonegroup get
{
...
    "zones": [
        {
            "id": "cc602d3a-de81-4682-ad51-59765acad32c",
            "name": "us-west",
            "endpoints": [
                "http:\/\/localhost:8001"
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "elasticsearch"
        },
        {
            "id": "cde6b332-5bb8-4dc0-8219-b68932565ea1",
            "name": "us-east-1",
            "endpoints": [
                "http:\/\/localhost:8000"
            ],
            "log_meta": "true",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": ""
        }
    ],
...
}
</pre>
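
As a rough illustration of how the 'tier_type' field could be consumed, the following Python sketch maps each zone in a trimmed copy of the output above to a sync module name. The helper names are hypothetical, and treating an empty 'tier_type' as "use the default module" (labelled @"default"@ here) is an assumption based on the us-east-1 entry.

```python
import json

# Trimmed copy of the zonegroup output; only the fields used below.
ZONEGROUP_JSON = """
{
    "zones": [
        {"name": "us-west", "tier_type": "elasticsearch"},
        {"name": "us-east-1", "tier_type": ""}
    ]
}
"""


def sync_module_for(zone: dict) -> str:
    # Assumption: an empty or missing tier_type selects the default
    # (regular data sync) module; "default" is just a label for it.
    return zone.get("tier_type") or "default"


def modules_by_zone(zonegroup: dict) -> dict:
    """Return {zone name: sync module name} for every zone listed."""
    return {z["name"]: sync_module_for(z) for z in zonegroup.get("zones", [])}


zonegroup = json.loads(ZONEGROUP_JSON)
print(modules_by_zone(zonegroup))
```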

The zone private configuration (which is not exposed to other zones)
has a new section that can be used to pass in sync-module-specific
configuration parameters. An example of such a parameter is the
endpoint of the elasticsearch server that will be used by the module.

<pre>
$ mrun c2 radosgw-admin zone get
{
...
    "metadata_heap": "us-west.rgw.meta",
    "tier_config": [
        {
            "key": "endpoint",
            "val": "http:\/\/localhost:9200"
        }
    ],
...
}
</pre>
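
The 'tier_config' section is a list of key/val pairs. A small Python sketch of turning it into a plain dictionary, using a trimmed copy of the output above (the helper name @tier_config_dict@ is hypothetical):

```python
import json

# Trimmed copy of the zone output. The raw string keeps the "\/" escapes
# exactly as radosgw-admin printed them; they are valid JSON.
ZONE_JSON = r"""
{
    "metadata_heap": "us-west.rgw.meta",
    "tier_config": [
        {"key": "endpoint", "val": "http:\/\/localhost:9200"}
    ]
}
"""


def tier_config_dict(zone: dict) -> dict:
    """Flatten the list of {"key": ..., "val": ...} entries into a dict."""
    return {entry["key"]: entry["val"] for entry in zone.get("tier_config", [])}


zone = json.loads(ZONE_JSON)
config = tier_config_dict(zone)
print(config["endpoint"])
```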

An example of a trivial sync module implementation is the log module,
which dumps to the rgw debug log every object that needs to be synced,
and the operation that is associated with it (create, delete,
delete_marker). Note that for object creation it only provides the
name of the object that needs to be synced, but not any extra
information about that object (such as the object's size and
metadata). We also have a not-as-trivial implementation that fetches
all the metadata associated with an object that needs to be synced
from the source zone. This serves as the foundation for the metadata
indexing module.
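
A Python sketch of what the log module does conceptually. The real module is part of rgw and written in C++; the logger name, the function names and the log line format below are made up for illustration.

```python
import logging

logger = logging.getLogger("rgw.sync.log")  # hypothetical logger name

# The three operations named in the text.
OPERATIONS = ("create", "delete", "delete_marker")


def format_entry(op: str, bucket: str, key: str) -> str:
    """Build the debug line for one object that needs to be synced.

    Only the operation and the object's location are available here;
    size and other metadata are deliberately absent, as in the log module.
    """
    if op not in OPERATIONS:
        raise ValueError(f"unknown sync operation: {op}")
    return f"SYNC_LOG: {op} bucket={bucket} object={key}"


def log_sync_event(op: str, bucket: str, key: str) -> None:
    """Dump the entry to the debug log and do nothing else."""
    logger.debug(format_entry(op, bucket, key))
```

The not-as-trivial variant would differ by issuing a metadata fetch against the source zone before calling the handler.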

h2. metadata search via elasticsearch

The solution for metadata search we discussed was to use elasticsearch
to index metadata. By metadata here I mean all the metadata that is
associated with data objects. Such metadata includes the object name,
containing bucket name, size, content type, user-specified metadata,
acls, etc. We will also be able to index other types of metadata that
are associated with the zone, such as existing users and bucket
names.

Elasticsearch is an indexing framework that provides simple
mechanisms to index data and to perform queries on it. Indexing the
data is done through a RESTful api that allows HTTP PUTting and
POSTing a json structure that describes the data. Querying the index
is done with a simple GET operation.
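
A hedged Python sketch of the two kinds of requests, built with the standard library only and not sent anywhere. The @/{index}/{type}/{id}@ path layout matches elasticsearch releases that still have document types and may not apply to other versions; the function names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def index_request(endpoint: str, index: str, doc_type: str,
                  doc_id: str, doc: dict) -> Request:
    """Build (but do not send) an HTTP PUT that indexes one json document."""
    url = f"{endpoint}/{index}/{doc_type}/{quote(doc_id, safe='')}"
    body = json.dumps(doc).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


def search_url(endpoint: str, index: str, field: str, value: str) -> str:
    """Build the URL for a simple GET query of the form q=field:value."""
    return f"{endpoint}/{index}/_search?" + urlencode({"q": f"{field}:{value}"})


req = index_request("http://localhost:9200", "rgw", "object",
                    "buck2:foo3:", {"bucket": "buck2", "name": "foo3"})
print(req.get_method(), req.full_url)
print(search_url("http://localhost:9200", "rgw", "meta.custom.color", "silver"))
```

Passing the request to @urllib.request.urlopen@ would actually send it; that step is left out so the sketch has no network dependency.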

A metadata indexing rgw zone consists of a ceph cluster (it can
piggy-back on another existing zone, or reside in its own ceph
cluster) that needs to hold the synchronization and rgw metadata
information. It also includes an elasticsearch server. The zone needs
to be configured with tier_type set to 'elasticsearch', and the zone
private configuration should point at the designated elasticsearch
server. Whenever a change happens in the zonegroup (or through the
initial full-sync process) the sync module updates elasticsearch with
the metadata associated with the object, or removes the object's
metadata altogether if needed. Once data is indexed, it is possible to
access elasticsearch and query the metadata by the different index
keys.
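
The update-or-remove behaviour can be sketched with an in-memory dictionary standing in for the elasticsearch index. Everything here is illustrative: the function names are hypothetical, and the @bucket_id:name:instance@ document id is only a guess inferred from the shape of the ids that show up in search results.

```python
def make_doc_id(bucket_id: str, name: str, instance: str = "") -> str:
    # Assumption: ids look like "<bucket id>:<object name>:<instance>".
    return f"{bucket_id}:{name}:{instance}"


def apply_change(index: dict, op: str, doc_id: str, metadata=None) -> None:
    """Apply one synced change to the stand-in index.

    "create" indexes the document, overwriting any older copy, so
    replaying the same change (e.g. full sync followed by incremental
    sync) leaves the index in the same state. "delete" removes the
    document and is a no-op if it is already gone.
    """
    if op == "create":
        index[doc_id] = dict(metadata or {})
    elif op == "delete":
        index.pop(doc_id, None)
    else:
        # Other operations (e.g. delete_marker) are not covered here.
        raise ValueError(f"unhandled operation: {op}")


index = {}
doc = make_doc_id("cde6b332.4127.1", "foo3")
apply_change(index, "create", doc, {"bucket": "buck2", "size": 35})
apply_change(index, "create", doc, {"bucket": "buck2", "size": 35})  # replay
apply_change(index, "delete", doc)
```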

An open question is whether we want to have rgw deal with the querying
api, or just leave it for elasticsearch to handle. Another option is
to proxy metadata requests (via rgw) and send them to elasticsearch.
Any solution that involves rgw in the querying process requires that
we make up a RESTful api (which could be based on the elasticsearch
api). Leaving this outside of rgw is trivial (from the perspective of
rgw development).

Elasticsearch does not provide an off-the-shelf security module (one
that is open and free, at least) for things such as user
authentication and/or authorization. Security is needed so that users
can only run metadata queries on their own data, and not on other
users' data. Current security mechanisms for elasticsearch require
either a proprietary non-open elasticsearch extension, or other
rudimentary open modules. The elasticsearch default "security" model
is nonexistent, and relies on limiting access to specific url prefixes
through a proxy. It is an open question which way we want to go. We
can leave it to the users (provided that we document the setup), or
address it by proxying requests through rgw and reusing its own auth.
The latter would solve the authorization issue, but would introduce an
api problem: it's probably not going to be compatible with the
commonly used elasticsearch api. We currently store which users have
read permissions to each object. This will allow filtering search
results so that only the authorized users can see the relevant
objects.
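
As a sketch of how the stored read permissions could be used, a proxy sitting in front of elasticsearch might drop every hit whose 'permissions' list does not contain the requesting user. The function name is hypothetical and the response layout is assumed to follow the usual elasticsearch search result structure.

```python
def visible_hits(response: dict, user_id: str) -> list:
    """Return only the hits that user_id has read permission on."""
    hits = response.get("hits", {}).get("hits", [])
    return [
        hit for hit in hits
        if user_id in hit.get("_source", {}).get("permissions", [])
    ]


# Hand-made response with two documents owned by different users.
sample = {
    "hits": {
        "hits": [
            {"_id": "a", "_source": {"name": "foo3", "permissions": ["yehsad"]}},
            {"_id": "b", "_source": {"name": "bar", "permissions": ["other"]}},
        ]
    }
}
print([hit["_id"] for hit in visible_hits(sample, "yehsad")])
```

Filtering after the fact keeps the sketch short; adding the user as a term to the query itself would avoid fetching documents that are then thrown away.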

h3. status

The branch (referenced above through the PR) implements a trivial
sync module that pushes an object's metadata to a configured
elasticsearch server. The main things that I want to add are:

* limit zone sync to prespecified zones, and not all zones in the zonegroup

This is not necessarily unique to this feature, but can help here
quite a lot. There are a few edge cases that can cause some trouble
when syncing from more than one zone here (they are not an issue with
the regular rgw data sync, but due to some limitations they can cause
some weirdness here).

* elasticsearch preconfiguration / index setup

We currently don't control any of the indexes that are being created
by elasticsearch. We just post the objects and let elasticsearch do
what it does. This might not be optimal, or might even be problematic.
One way to handle that would be to preconfigure the index when first
starting the sync process on the elasticsearch zone. However, it will
need more elasticsearch expertise, as there are some tricky issues
(e.g., what to do with user-defined custom metadata?).

And finally, here's an example of a query that retrieves all the
objects that have the custom 'color' metadata field set to 'silver':

<pre>
$ curl -XGET http://localhost:9200/rgw/_search?q=meta.custom.color:silver
{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.30685282,
        "hits": [
            {
                "_index": "rgw",
                "_type": "object",
                "_id": "cde6b332-5bb8-4dc0-8219-b68932565ea1.4127.1:foo3:",
                "_score": 0.30685282,
                "_source": {
                    "bucket": "buck2",
                    "name": "foo3",
                    "instance": "",
                    "owner": {
                        "id": "yehsad",
                        "display_name": "yehuda"
                    },
                    "permissions": [
                        "yehsad"
                    ],
                    "meta": {
                        "size": 35,
                        "mtime": "2016-08-23T18:28:40.098Z",
                        "etag": "6216701a3418bdf56506670b91db3845",
                        "x-amz-date": "Tue, 23 Aug 2016 18:28:39 GMT",
                        "custom": {
                            "color": "silver"
                        }
                    }
                }
            }
        ]
    }
}
</pre>
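
A client consuming such a response would typically only care about the '_source' documents. A small Python sketch that reduces a trimmed copy of the response above to (bucket, name, size) tuples; the function name @summarize@ is hypothetical.

```python
import json

# Trimmed copy of the response; only the fields used below are kept.
RESPONSE_JSON = """
{
    "hits": {
        "total": 1,
        "hits": [
            {
                "_id": "cde6b332-5bb8-4dc0-8219-b68932565ea1.4127.1:foo3:",
                "_source": {
                    "bucket": "buck2",
                    "name": "foo3",
                    "meta": {"size": 35, "custom": {"color": "silver"}}
                }
            }
        ]
    }
}
"""


def summarize(response: dict) -> list:
    """Return one (bucket, object name, size) tuple per matching object."""
    summary = []
    for hit in response["hits"]["hits"]:
        source = hit["_source"]
        summary.append((source["bucket"], source["name"], source["meta"]["size"]))
    return summary


response = json.loads(RESPONSE_JSON)
print(summarize(response))
```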

h2. open questions

h3. special metadata search api

Do we need to integrate a special metadata search api into rgw? This
will require defining the new api. Another option is to just proxy
requests to elasticsearch. A third option is to do nothing.

h3. authorization

Authorization is an issue. As it is now, a user that can connect to
the elasticsearch server will by default be able to yank the metadata
of all objects in the system. There are ways to configure a proxy on
top of elasticsearch to limit users' operations. The read permissions
field that we provide as part of the indexed documents should help
there.

* Should we just document how users need to configure the system?
* Do we need to investigate an external elasticsearch security module? The ones I've seen didn't seem to quite cut it.
* Should we proxy requests through rgw to force authorization through that?

h3. deep object inspection for metadata classification

As it is now we only look at the current metadata assigned to the
objects. Something that we can explore in the future is some
extensible way to look into the actual objects' data, and then attach
other metadata to these objects. E.g., for image files it could store
the resolution, or the EXIF data if it exists. This can currently be
done by external scripts that run over the data itself and update the
objects' metadata, but hooking it into the sync module itself could be
interesting.
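
To give a feel for what such an inspection hook might look like, here is a Python sketch that pulls the resolution out of a PNG file's header and turns it into extra metadata. It only handles PNG, and the metadata key names ('image-width', 'image-height') and function names are invented for the example.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_resolution(data: bytes):
    """Return (width, height) if data starts with a PNG header, else None.

    A PNG file is the 8-byte signature followed by the IHDR chunk:
    4-byte length, 4-byte type "IHDR", then big-endian width and height.
    """
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def extra_metadata(data: bytes) -> dict:
    """Derive additional metadata from the first bytes of an object."""
    resolution = png_resolution(data)
    if resolution is None:
        return {}
    return {"image-width": str(resolution[0]),
            "image-height": str(resolution[1])}


# Minimal fake header for a 640x480 image (not a complete PNG file).
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 640, 480)
print(extra_metadata(header))
```

Only the first 24 bytes of the object are needed, so a hook like this would not have to read the whole object.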