Thank you for this reference :)

What does u refer to in this slide?

There is a programming language that can make the simulation implementation easier: https://taichi.graphics/ The authors have recently created a startup company to take this further!

Yes, you are absolutely right. To explain, the plots show a probability distribution. In other words, if we draw samples and plot their values, we could get histogram curves like those shown. We want to know which collection of samples would have lower variance. That would indeed be the plot on the left, intuitively because more of the samples would be clumped up closer to the mean.

Thank you for pointing this out.

In lecture, Professor Pollard said the right one had lower variance. I think she misunderstood the graphs and that the left one has lower variance.

If the ray does not intersect the plane, then (N^T . d) should be zero, because the ray has no component in the direction normal to the plane. That is the situation to check for the edge case.
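To make the edge case concrete, here's a minimal sketch (my own example, not from the slides), assuming the plane is written as N . p = c and the ray as p(t) = o + t*d:

```python
import numpy as np

def ray_plane_intersect(o, d, N, c, eps=1e-9):
    """Intersect ray p(t) = o + t*d with plane N . p = c.

    Returns t, or None when the ray is parallel to the plane
    (N . d ~ 0), which is the edge case to check for."""
    denom = np.dot(N, d)
    if abs(denom) < eps:
        # ray direction has no component along the plane normal
        return None
    return (c - np.dot(N, o)) / denom
```

Dividing without the check would produce a division by zero (or a huge bogus t) for rays parallel to the plane.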

We saw that we can visit the child that is closest in an attempt to save work. If we only need to know if there is some hit and don't care where it is, we may use other metrics (e.g., density) to make a decision as well.

This is an interesting thought. I am aware of research using machine learning for fast collision detection (https://www.ucsdarclab.com/fastron) but I do not know of anyone yet solving exactly the problem you propose.

In class, we ran through an example of a ray that doesn't intersect with the plane, and through the calculations, we got t = 0. Is that always the case for a non-intersecting ray? Does a non-intersecting ray always generate 0, or can it generate anything that's not c?

When visiting the children, should we sometimes swap the order and visit child2 first if it is more promising?

Just a random thought: these adaptive structures have handcrafted splitting rules. Is there any way to learn the data structures for graphics using machine learning? Related paper: https://arxiv.org/abs/1712.01208

What it means is that if f(your_point) = 0, then your_point satisfies the function (i.e., your_point is on the line, plane, or complicated implicit surface defined by this equation). If f(your_point) != 0, then your_point does not satisfy the function and is not contained in the shape that you have implicitly defined.
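As a tiny illustration (using a unit sphere as the implicit surface, which is my example rather than the one on the slide):

```python
def f(x, y, z):
    # implicit unit sphere: points with f == 0 lie exactly on the surface,
    # f < 0 is inside, f > 0 is outside
    return x * x + y * y + z * z - 1.0
```

So f(1, 0, 0) = 0 means the point (1, 0, 0) is on the sphere, while f(0, 0, 0) = -1 means the origin is not on the surface.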

The DoF *decreases*, because we are moving from cubic Bezier (3rd order curve made by weighting 4 basis functions) to quadratic (2nd order curve made by weighting 3 basis functions)

Indeed it is, thank you. As I am unable to coerce Keynote to save this image into the PDF properly, I have uploaded it onto Piazza.

Does this mean we need to find a function f such that any values of x, y, z will always yield 0? If so, how is the equation in the next slide always yielding 0? If I put x = 1, I get f = 1 - 1.23 = -0.23 != 0..... :/

Why can't we do this with quadratic Bezier curves? It seems like only the degrees of freedom increases, while the # of constraints stays the same, meaning DOF > # of constraints so we can find a satisfying assignment of control points.

I think the graphic at the top of the slide is cut off

It's good that the order of the averaging does not matter. So we can think of this as averaging two bilinear filters, and in terms of implementation we can reuse the bilinear code.

In this case, I would start with i=-1, j=-1. Then we would be interpolating between the colors of a different set of pixels. If f00 is the lower left-hand corner of the image, we'd have to think about what we do if we run off the boundary of the texture. In this case, let's assume we just copy the nearest color when we are outside the boundary. Then we have an f_{-1,-1}, f_{-1,0}, f_{0,-1}, and f_{0,0}, all of which are the same color, so it wouldn't matter here. However, just to finish the example, we would set s = 0.1 - (-1 + 1/2) = 0.6 and t = 0.6 and carry through the interpolation as (1-t)((1-s)f_{-1,-1} + sf_{0,-1}) + t((1-s)f_{-1,0} + sf_{0,0}).
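Here is a small sketch of that scheme (my own code, assuming pixel (i, j) has its center at (i + 0.5, j + 0.5) and clamp-to-edge for out-of-bounds texels, as in the example above):

```python
import math

def texel(img, i, j):
    # clamp-to-edge: copy the nearest color when outside the boundary
    h, w = len(img), len(img[0])
    return img[min(max(j, 0), h - 1)][min(max(i, 0), w - 1)]

def bilinear(img, u, v):
    # pixel centers sit at (i + 0.5, j + 0.5); find the 2x2 neighborhood
    i, j = math.floor(u - 0.5), math.floor(v - 0.5)
    s, t = u - (i + 0.5), v - (j + 0.5)
    f00, f10 = texel(img, i, j), texel(img, i + 1, j)
    f01, f11 = texel(img, i, j + 1), texel(img, i + 1, j + 1)
    return (1 - t) * ((1 - s) * f00 + s * f10) + t * ((1 - s) * f01 + s * f11)
```

For (u, v) = (0.1, 0.1), this computes i = j = -1 and s = t = 0.6, matching the worked numbers above, and the clamping makes all four texels the same corner color.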

So would it be the case that if (u, v) = (0.1, 0.1), and we calculate i, j according to the formula, we get i < 0 and j < 0, so it seems we should assign s=1 and t=1 and (u, v) would get the color f11, right?

We had a question in class about why turning on supersampling (e.g., 8x, which should be taking 8 times the number of samples) does not tend to cause a corresponding (8x) increase in runtime. The short answer is that there is some cleverness going on under the hood that cuts down on the number of samples that have to be taken in total to get each individual frame on the screen. The following is one reference for how some of this is done and includes a small set of timing numbers: https://www.pcgamer.com/pc-graphics-options-explained/2/

From the posting, we also see that MLAA (Morphological Anti-aliasing) is a postprocessing step on the entire image, and so it will not have the same scaling behavior, because it does not require taking more samples. Instead, it looks for edges in the image and based on the edge geometry applies an appropriate blur using image processing techniques to make them look better.

Ok, so what does closed under composition even mean here --- in this case, it means that we should end up only with colors that are a mix of the colors we started with. Specifically, if we take a bunch of "bright reds" and perform the "over" operation on them, we should only ever end up with more "bright reds." The non-premultiplied alpha fails in this case, because it takes a group of things (well, 2 things) with RGB value (1,0,0) and returns a new thing with RGB value (0.75,0,0), whereas the premultiplied alpha returns another (1,0,0).

I'm not sure how useful this observation is in the general case, because if you start with a set of different color values (e.g., (1,0,0) and (0,1,0)) you can get different mixes of those colors depending on the relative alpha values, and this is true for both premultiplied and non-premultiplied alpha (although the exact mixes of red and green will differ for the two methods). So, to my mind, I just treat this example as an interesting failure case that shows yet another reason why premultiplied alpha is probably a good idea.
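To make the failure case concrete, here's a sketch of both "over" variants for a single (red) channel, using alpha = 0.5 for both layers (my numbers, chosen to reproduce the (1,0,0) -> (0.75,0,0) example above):

```python
def over_straight(ca, aa, cb, ab):
    # "over" with non-premultiplied colors: blend the colors directly
    co = aa * ca + (1 - aa) * ab * cb
    ao = aa + (1 - aa) * ab
    return co, ao

def over_premult(cap, aa, cbp, ab):
    # "over" with premultiplied colors (cap = aa * ca, cbp = ab * cb)
    return cap + (1 - aa) * cbp, aa + (1 - aa) * ab
```

Compositing bright red (1, alpha 0.5) over bright red gives a stored color of 0.75 in the straight version (no longer a "bright red"), while the premultiplied version, once divided back out by its alpha of 0.75, returns exactly 1 again.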

Examples: tiles, grass, trees (far away)...

I changed this slide (and the following one) to include more details (and both variations for aligning the camera).

All modern displays are raster displays. However, that hasn't stopped the interest in vector based representations, due to their many advantages in situations where you want to at least have a representation of your image that doesn't have built-in aliasing (think fonts).

Research in vector graphics continues, and there has always been an undercurrent of discussion of whether we may want vector displays for some applications. Check out this recent paper for some perspectives: https://people.csail.mit.edu/tzumao/diffvg/

We probably do not want to use a cross product operation to try to get rotation by an arbitrary theta. That becomes a chicken-and-egg problem: the result of a cross product is always orthogonal to both input vectors, so we would have to already have a vector orthogonal to the one we want before we could even start.

One way to answer this question (not the only way!) would be to use the basis vectors we already have.

So, for example, cos(theta)u + sin(theta)(Nxu) where u and (Nxu) are orthogonal basis vectors in the plane.
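A quick sketch of that formula (my own code), assuming u is a unit vector in the plane and N is the unit plane normal:

```python
import numpy as np

def rotate_in_plane(u, N, theta):
    """Rotate u by theta within the plane with unit normal N,
    using the orthogonal in-plane basis {u, N x u}."""
    return np.cos(theta) * u + np.sin(theta) * np.cross(N, u)
```

For theta = 90 degrees this returns N x u itself, and the result always stays in the plane (its dot product with N is zero) with the same length as u.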

Yes that makes sense.

At pretty much any sampling rate it is a good idea to do this kind of stratified sampling where you subdivide a pixel into equal regions and sample randomly in each region. The initial subdivision helps guard against bad random number pulls, which could lump samples in a single part of the pixel by chance, and the "jittering" within each subregion helps guard against aliasing which may occur at this finer level (e.g., imagine a high frequency checkerboard where sampling in the centers of each region pulls out only the whites and skips the blacks).
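A minimal sketch of that stratified ("jittered") pattern for one pixel, treating the pixel as the unit square (my own code):

```python
import random

def jittered_samples(n):
    """n x n stratified samples in the unit pixel [0,1)^2:
    one uniformly random sample inside each of the n*n sub-regions."""
    cell = 1.0 / n
    return [((i + random.random()) * cell, (j + random.random()) * cell)
            for j in range(n) for i in range(n)]
```

Each sample is guaranteed to land in its own sub-region (the stratification), while the random offset within the cell provides the jitter that breaks up aliasing patterns.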

One of the goals of this process was to turn the perspective projection into a matrix operation that could be done as part of a long sequence of matrix ops that transform the geometry of your scene -- i.e., move them into position, put them in the camera's frame of view, do the projection. If we can create one single matrix to do all of that, then we form the matrix once and apply it (possibly in parallel) to a large number of triangles. The GPU is highly optimized to do all that.

Given our goal of making it a matrix operation in the first place, we need to copy z into the homogeneous coordinate because there is no way to do division through matrix multiplication. The division comes from the "trick" of dividing everything out by the homogeneous coordinate at the end of all the matrix operations.
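Here's a stripped-down sketch of that idea (ignoring near/far planes and the rest of a real projection matrix), showing the copy-z-into-w row and the divide at the end:

```python
import numpy as np

# simplest possible perspective matrix: the last row copies z into
# the homogeneous coordinate w, since matrices cannot divide
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

def project(p):
    # apply the matrix, then do the divide-by-w "trick" at the very end
    x, y, z, w = P @ np.array([p[0], p[1], p[2], 1.0])
    return x / w, y / w  # = (x/z, y/z)
```

Because the division is deferred to a single final step, P can be folded into one long chain of modeling and viewing matrices and applied to every vertex in parallel.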

Does that answer your question?

I think different colors represent different triangles.

One property of rotations is that they must be performed about the origin to remain linear transformations. If the rotation is instead performed about a point x, we run into issues with additivity [f(u+v) = u+v+x while f(u)+f(v) = u+v+2x] and with homogeneity [f(au) = au+x while af(u) = au+ax] (taking the rotation part to be the identity for illustration). Hence, the rotation won't be performed correctly if the translation does not occur first.
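The standard fix is the translate-rotate-translate-back composition; a small 2D sketch (my own code):

```python
import numpy as np

def rot2d(theta):
    # 2D rotation matrix about the origin (a linear map)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def rotate_about(p, pivot, theta):
    # translate pivot to the origin, rotate there, translate back
    return rot2d(theta) @ (np.asarray(p) - pivot) + pivot
```

Rotating (2, 1) by 90 degrees about the pivot (1, 1) gives (1, 2), whereas applying rot2d directly (rotation about the origin) would give (-1, 2), the wrong result described above.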

Could I get some further explanation on the third bullet point? I'm not sure why we need a matrix to copy the z coordinate into the homogeneous coordinate. Why can't we be satisfied with representing the 2D coordinate (e.g. (u,v) or (x/z,y/z)) as its homogeneous coordinate: (x/z, y/z)?

Thanks!

Does it seem that if the sample rate is low, it is not a good strategy to place the samples at random positions within sub-pixels? And if the sample rate is high, would it give a better result than just placing the samples at the centers of sub-pixels?

If we rotate without translating first, the rotation will be done about the origin, which produces a wrong result.

I have read about other types of displays that do not use pixels (https://en.wikipedia.org/wiki/Vector_monitor). I imagine the computer graphics pipeline would be different on these devices, but they are no longer used anyway.

This is a little confusing for me- does the light grey and dark grey represent how much of the pixel is covered? If so, why is the very top right triangle only light grey around the left edge? Shouldn't it be dark grey since it's a left edge?

Just want to double check my understanding of everything. Is it true that `argmax(F) = g`, i.e., that to maximize `F` we'd pass in input `g`? I think this is true because nothing can be more aligned with `g` than itself, but I want to double check.

edit: I want to take that back. I think that the `argmax` wouldn't be defined, because we could always pass in `cg` for some $$c \in \mathbb{N}$$ to get a larger output. That'd be an input perfectly aligned with `g` but scaled, and that'd scale the output, i.e. `<cg, g> = c<g, g>`.

Just to check my understanding. Would the solution to the second question be that `n` would not be a normal but instead be at angle `theta` to `u`? I'm not sure, though, whether the resulting vector would still be in the plane. I think so.

u is simply a parameter that varies from 0 at the first control point to 1 at the second control point. For animations, it may be time. However, for a spline represented based on control points in space, it does not have a specific meaning. In particular, changes in u do not match changes in arclength of the curve, so we cannot even use it directly to break up the curve into identical arclength segments.
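You can see the mismatch between u and arclength numerically; a small sketch (my own example control points) that measures the distance covered between equally spaced u values on a cubic Bezier:

```python
import math

def bezier3(u, P):
    # cubic Bezier point: weighted sum of the 4 Bernstein basis functions
    b = [(1 - u) ** 3, 3 * (1 - u) ** 2 * u, 3 * (1 - u) * u ** 2, u ** 3]
    return tuple(sum(w * p[k] for w, p in zip(b, P)) for k in range(2))

def chord_lengths(P, steps):
    # chord length covered between consecutive, equally spaced u values
    pts = [bezier3(i / steps, P) for i in range(steps + 1)]
    return [math.dist(a, b) for a, b in zip(pts, pts[1:])]
```

With unevenly placed control points such as (0,0), (5,0), (5,0), (5,1), the first half of the u range (u from 0 to 0.5) covers several times more distance along the curve than the second half, so equal u steps are far from equal arclength steps.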