DF100 - 04 - Storage and Retrieval With Arrays
DF100 - 04 - Storage and Retrieval With Arrays
Release: 20240216
Topics we cover
Array Queries
Array Updates
Expressive Updates
Upsert Operations
FindOneAndUpdate
3
Querying Arrays
4
A surprising array query
If so, why?
5
Using $elemMatch
When we match against an array, we check that each requirement matches at least one element
of the array - but it doesnt need to be the same element.
For that we have to use $elemMatch
Exercise: $elemMatch
In sample_restaurants.restaurants, there are hygiene ratings and dates of inspection
details.
Executing this query to find restaurants that have had a hygiene rating of C after 2013
finds 2,675 restaurants:
db.restaurants.find({
"grades.grade":"C",
"grades.date":{$gt:ISODate("2013-12-31")}
})
This is incorrect as finds one that hasn't had a C since 2014. What would be the correct
query?
Try this query to get the shape of a restaurant with some relevant data.
MongoDB> db.restaurants.findOne()
{ "_id" : ObjectId("5eb3d668b31de5d588f4292a"),
"address" : {
"building" : "2780",
"coord" : [ -73.98241999999999, 40.579505 ],
"street" : "Stillwell Avenue",
"zipcode" : "11224"
},
"borough" : "Brooklyn",
"cuisine" : "American",
"grades" : [
{ "date" : ISODate("2014-06-10T00:00:00Z"), "grade" : "A", "score" : 5 },
{ "date" : ISODate("2013-06-05T00:00:00Z"), "grade" : "A", "score" : 7 },
{ "date" : ISODate("2012-04-13T00:00:00Z"), "grade" : "A", "score" : 12 },
{ "date" : ISODate("2011-10-12T00:00:00Z"), "grade" : "C", "score" : 12 }
],
"name" : "Riviera Caterer",
"restaurant_id" : "40356018"
}
Updating Arrays
8
Basic Array updates - $push
{ _id : ObjectId("..."),
not already exist name : "funky",
tracks : [
{artist : "Queen", track : "Bicycle Race" },
Can be used with various {artist : "Abba", track : "Waterloo" },
{artist : "AC/DC", track : "Thunderstruck" } ]
}
modifiers
9
$push adds a value to the end of the array. You can add modifiers to $push to change this
behaviour
Basic Array updates - $pop
MongoDB> db.playlists.updateOne({name:"funky"},
updateMany commands { $pop: {tracks: 1}})
MongoDB> db.playlists.find({},{_id:0})
MongoDB> db.playlists.find({},{_id:0})
{ name : "funky",
tracks : [{artist : "abba", track : "Waterloo" }]
}
10
$pop is used to remove things from an array - either the first or last element.
Internally, each array element has an index number - the first element has index 0, second
element has index 1, etc
If you remove the first element, the second element will become the first, etc, so the index
numbers must all be updated.
Basic Array updates - $pull
MongoDB> db.playlists.updateOne({name:"funky"},
{ $pull : { tracks : { artist: "Queen" }}}) //remove all
elements where the value of 'artist' is 'Queen'
MongoDB> db.playlists.find({},{_id:0}).pretty()
{ name : "funky",
tracks : [{artist : "Abba", track : "Waterloo" },
{artist : "AC/DC", track : "Hells Bells" }]}
11
For $pull you specify a value or query and all matching values are removed from the array.
Array Operators - $addToSet
12
$addToSet makes the array a unique list of values by not adding a value if the value is already
there.
If there is already a duplicate this will not convert the array to a set, it will just not add another
instance of the value.
Arrays and $each
Passing an array of elements //Add hours for Thursday and Friday Incorrectly
> db.a.updateOne({name:"Tom"}, {$push:{hrs:[2,9]}})
with push adds them as one { acknowledged: true, matchedCount: 1, modifiedCount: 1 }
array element inside the original > db.a.find({},{_id:0})
array { name : "Tom", hrs : [ 4, 1, 3, [ 2, 9 ] ] }
13
In the example, $push does not give the desired effect - the array itself is added to the existing
array, not the individual elements
Arrays and $each
> db.a.find({},{_id:0})
{ name : "Tom", hrs : [ 4, 1, 3, 2, 9 ] }
14
> db.a.find({},{_id:0})
{ name : "Tom", hrs : [ 9, 4, 3, 2, 1 ] }
15
Elements can also be sorted as they are pushed but doing this client side will likely get a
undesired effects.
Example,
If following steps are done simultaneously by different users, we might end up overwriting data:
User1
doc = db.a.findOne({name: "Tom"})
doc.hrs.push(2)
doc.hrs.push(9)
doc.hrs.sort()
db.a.updateOne({name: "Tom"}, {$set: doc})
User2
doc = db.a.findOne({name: "Tom"})
doc.hrs.push(3)
doc.hrs.push(5)
doc.hrs.sort()
db.a.updateOne({name: "Tom"}, {$set: doc})
$sort and $slice
16
> db.a.find({},{_id:0})
{ "name" : "Tom", "hrs" : [ 0, 0, 1, 1, 0 ] }
17
18
19
Array filters allow us to select specific members of an array to modify by defining addition match
criteria/queries.
They are separate from the find() part of the query so we can find() with one query but then decide
how to update the array based on different criteria
If we do include some of the filter criteria in the find() part as we have done here - then we can
avoid the computation of even trying to update records where the filter won't match.
We can define multiple named array filters in our update - we called this one nohrs, nohrs is called
an identifier
In the filter the identifier is evaluated against each array element and those that match are
updated.
Here we say If the element is less than one then update it
If the array elements are objects - we can dereference them in the filter with dots - e.g in an array
of items on a purchase receipt { "arrayFilters" : [ { "item.type" : "socks, "item.price" : {$gt:5} } ] }
Exercise: Array updates
20
Hint: Use the $avg operator for calculating the mean score.
How do you get the Students worst score to a known place in the array?
Expressive Updates
db.shapes.updateMany({
Expressive updates use shapename: {
aggregation expressions to define $in:["rectangle","square"]
the fields to $set }},
[{
Updates can be based on values in $set: {
the document or calculated area: {
conditions $multiply:
["$w","$h"]
Example: Add an area field to all }}}])
our rectangular records. If we add a
field like this we can build an index
on it.
21
Updates can also be performed using an aggregation pipelines expression - we cover them later in
the course.
This can be used to update a document based on other values in the document e.g. if a=1 then add
2 to b else add c to b
The update operation is an array containing a pipeline that outputs the new changes
The update mutation is an array of pipeline stages, hence the square brackets.
Upsert
"upsert: true"
>//Nothing found to update - we have no player "joe"
> db.players.updateOne({name:"joe"},{$inc:{games:1}},{upsert:true})
{ acknowledged : true, matchedCount : 0, modifiedCount : 0,
Upsert inserts a new document upsertedId: ObjectId("...") }
> db.players.find()
if none are found to update { _id : ObjectId("..."), name : "joe", games : 1 }
>//Document created because of upsert
22
Useful in a few design patterns and simplifies code versus a call to update then to insert.
findOneAndUpdate()
23
If we wanted to update a value and fetch it, to create a sequence say for a "Customer number" or
"Invoice NUmber" we could do it like this - and in a single thread this is fine. If there is no
projection applied in the findOne() commands, the result would still be the same as there is only
one field apart from the _id field.
findOneAndUpdate()
24
If we extend this to multiple threads then we can have a race condition where we update twice
then fetch twice.
It would be fine if we don’t provide the projection in findOne() commands as there is only one field
other than the _id. It would give us the same result.
findOneAndUpdate()
problem
25
With findOneAndUpdate the single, atomic command finds, modifies and returns the document.
By default it will return the document BEFORE the update. Use the {returnNewDocument: true}
parameter to return the document AFTER the update.
You can apply things like sorting and projection to control what it finds and modifies.
It only affects one document and also will work with upsert.
Recap
Recap
There are multiple ways to update
specific array elements
FindOneAndUpdate atomically
returns the version of the updated
document
26
Quiz Time!
27
#1. Select three queries which will
return a document containing the field:
{ scores: [1,2,3] }
find ({ find ({
A find ({scores: 3}) B })
scores: [3,2,1] C })
scores: {$all: [1,7]}
find ({
find ({
$expr: {
D scores: {$in: [1,3,7]}
})
E $eq:[{$sum:"$scores"}, 6]
}})
find ({ find ({
A find ({scores: 3}) B })
scores: [3,2,1] C })
scores: {$all: [1,7]}
find ({
find ({
$expr: {
D scores: {$in: [1,3,7]}
})
E $eq:[{$sum:"$scores"}, 6]
}})
29
A Stops contention
between writers B client from
overwriting C Reduces network
traffic
another's changes
Multiple retries
D
Excess locking can
be prevented E from writers can
be avoided
Multiple retries
D
Excess locking can
be prevented E from writers can
be avoided
31
#3. Consider the following operations executed on an empty collection:
updateOne({_id: 700}, {$push: {sensor: {$each: [4,0,2], $sort: -1}}},{upsert: true})
updateOne({_id:700 }, { $push: { sensor: { $each: [0, 7, 3, 4, 5], $slice: 5}}})
Select the number of array elements stored in the sensor array field:
There are 8
D elements in total
in the array
There are 8
D elements in total
in the array
33
The $slice operator limits the number of array elements to 5 in this case.
Exercise
Answers
34
Answer -Exercise: $elemMatch
MongoDB> db.restaurants.findOne()
{
"_id" : ObjectId("5eb3d668b31de5d588f4292a"),
"address" : {
"building" : "2780",
Use $elemMatch to find where an array element "coord" : [
-73.98241999999999,
matches the query: ],
40.579505
} "grade" : "A",
"score" : 12
},
}) {
"date" : ISODate("2011-10-12T00:00:00Z"),
"grade" : "C",
"score" : 12
}
Answer: 722 Restaurants to avoid. ],
"name" : "Riviera Caterer",
"restaurant_id" : "40356018"
}
35
This time we want the Grade and the Date to be in the same element, it will not match this
document.
db.grades.updateMany({
scores:{$elemMatch:{type:"homework", score:{$gt:90}}}},
{$inc: {"scores.$[filter].score":-10}},
36
In the grades collection add a new field containing their the mean
score in each class.
db.grades.updateMany( {},
[{ $set: { average : { $avg: "$scores.score" }}}]
)
37
> db.playlists.find({},{_id:0}).pretty()
{ "name" : "funky",
"tracks" : [{"artist":"Queen", "track" : "Bicycle Race" },
{"artist":"AC/DC", "track" : "Thunderstruck" }]}
40
For $pullAll you specify an array of values and all instances of any of them are removed.
Updating Array Elements
41
In the next set of slides we look at different ways to update specific members of arrays.
Imagine our database contains information about the number of hours a person "Tom" will work so
the array "hrs" of 5 members represents days of the week, each is how many hours they need to
work that day.
This example is kept very minimalist for clarity.
Arrays - Update All Elements
42