By using the merge
function and its optional parameters:
Inner join: merge(df1, df2)
will work for these examples because R automatically joins the frames by common variable names, but you would most likely want to specify merge(df1, df2, by = "CustomerId")
to make sure that you were matching on only the fields you desired. You can also use the by.x
and by.y
parameters if the matching variables have different names in the different data frames.
Outer join: merge(x = df1, y = df2, by = "CustomerId", all = TRUE)
Left outer: merge(x = df1, y = df2, by = "CustomerId", all.x = TRUE)
Right outer: merge(x = df1, y = df2, by = "CustomerId", all.y = TRUE)
Cross join: merge(x = df1, y = df2, by = NULL)
Just as with the inner join, you would probably want to explicitly pass "CustomerId" to R as the matching variable. I think it's almost always best to explicitly state the identifiers on which you want to merge; it's safer if the input data.frames change unexpectedly and easier to read later on.
You can merge on multiple columns by giving by
a vector, e.g., by = c("CustomerId", "OrderId")
.
If the column names to merge on are not the same, you can specify, e.g., by.x = "CustomerId_in_df1", by.y = "CustomerId_in_df2"
where CustomerId_in_df1
is the name of the column in the first data frame and CustomerId_in_df2
is the name of the column in the second data frame. (These can also be vectors if you need to merge on multiple columns.)
While the question has been answered, I'd like to add some useful tips when using matplotlib.pyplot.savefig
. The file format can be specified by the extension:
from matplotlib import pyplot as plt
plt.savefig('foo.png')
plt.savefig('foo.pdf')
Will give a rasterized or vectorized output respectively, both which could be useful. In addition, there's often an undesirable, whitespace around the image, which can be removed with:
plt.savefig('foo.png', bbox_inches='tight')
Note that if showing the plot, plt.show()
should follow plt.savefig()
, otherwise the file image will be blank.
Best Answer
dev.off()
closes the current graphical device. This clears all of the plots for me in RStudio as long as I don't have a different graphical device open at the moment. If you do have other graphical devices open then you can usedev.list()
to figure out which graphical device is RStudio's. The following should do that but I haven't tested it thoroughly.But if you aren't doing anything else then just using
dev.off()
should take care of it.