CS474: Human Computer Interaction - Augmented Reality (100 Points)
Assignment Goals
The goals of this assignment are:

- To write a program that uses augmented reality to display dynamic educational material
- To use OpenCV to create a dynamic image and display it as an AR overlay
Background Reading and References
Please refer to the following readings and examples, which offer templates to help you get started:

- Modalities - Augmented Reality Activity
- Rosebrock, A. - OpenCV Augmented Reality (AR)
- Creating your own image using numpy and opencv
- Drawing Functions in OpenCV
- Tracking ArUCo Markers
The Assignment
In this assignment, you will incorporate the augmented reality program we explored in class into a user application. Specifically, you will write a program to solve one of two problems:
- Wireless hotspot locations and relative signal strengths
- A scavenger hunt / capture the flag game on campus using clues and hot/cold markers to inform the user
For your application, detect a particular marker on the screen and create a dynamic image that displays meaningful information on the webcam overlay. (If you like, you can use a marker other than one from the default ArUCo dictionary, or overlay your information over the entire screen without a marker.)
In addition to your implementation, be sure to include a LaTeX design report in academic journal format (you can use Overleaf for this purpose) that describes your initial design, rationale, stakeholder evaluation, and any subsequent revisions you made from your stakeholder input.
Example Code for ArUCo Card Detection and Image Overlay
Here is an example of how to detect the ArUCo Pantone Color Cards:
# https://www.pyimagesearch.com/2021/01/04/opencv-augmented-reality-ar/
# pip install opencv-contrib-python imutils
import numpy as np
import argparse
import imutils
import sys
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-s", "--source", required=True,
                help="path to input source image that will be put on input")
args = vars(ap.parse_args())

cap = cv2.VideoCapture(0)
ret, image = cap.read()
source = cv2.imread(args["source"])

# The marker dictionary and detector parameters only need to be created once
arucoDict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_ARUCO_ORIGINAL)
arucoParams = cv2.aruco.DetectorParameters()

# Initialize a dictionary to store the last known corners for each ArUco marker
cached_corners = {}

while True:
    ret, image = cap.read()
    (imgH, imgW) = image.shape[:2]

    print("[INFO] detecting markers...")
    # Note: if cv2.aruco.detectMarkers is unavailable in your OpenCV version,
    # the equivalent object-based call is
    # cv2.aruco.ArucoDetector(arucoDict, arucoParams).detectMarkers(image)
    corners, ids, rejected = cv2.aruco.detectMarkers(image, arucoDict, parameters=arucoParams)
    cv2.aruco.drawDetectedMarkers(image, corners)

    # Update cached corners if new ones are found
    if ids is not None:
        for id_val, corner in zip(ids.flatten(), corners):
            cached_corners[id_val] = corner  # Update or add the corner for this ID

    # Check if we have all four required corners in the cache
    all_corners_found = all(id_val in cached_corners for id_val in [923, 1001, 241, 1007])
    if all_corners_found:
        # If all corners are found, update 'corners' to use the cached corners in order
        corners = [cached_corners[id_val] for id_val in [923, 1001, 241, 1007]]
        ids = np.array([923, 1001, 241, 1007]).reshape(-1, 1)  # Reshape for compatibility with later code
    else:
        print("[INFO] could not find 4 corners; found {}... press any key to continue or q to quit".format(len(cached_corners)))
        cv2.imshow("Input", image)
        key = cv2.waitKey(1) & 0xFF
        if key == ord('q'):
            break
        continue

    # Construct the augmented reality visualization only if all corners are found
    print("[INFO] constructing augmented reality visualization...")
    refPts = [np.squeeze(corner) for corner in corners]  # Flatten corner arrays

    # Define the *destination* transform matrix, using the outermost corner of
    # each marker: the top-left corner of the top-left marker, the top-right
    # corner of the top-right marker, and so on
    (refPtTL, refPtTR, refPtBR, refPtBL) = refPts
    dstMat = [refPtTL[0], refPtTR[1], refPtBR[2], refPtBL[3]]
    dstMat = np.array(dstMat)

    # grab the spatial dimensions of the source image and define the
    # transform matrix for the *source* image in top-left, top-right,
    # bottom-right, and bottom-left order
    (srcH, srcW) = source.shape[:2]
    srcMat = np.array([[0, 0], [srcW, 0], [srcW, srcH], [0, srcH]])

    # compute the homography matrix and then warp the source image to the
    # destination based on the homography
    (H, _) = cv2.findHomography(srcMat, dstMat)
    warped = cv2.warpPerspective(source, H, (imgW, imgH))

    # construct a mask for the source image now that the perspective warp
    # has taken place (we'll need this mask to copy the source image into
    # the destination)
    mask = np.zeros((imgH, imgW), dtype="uint8")
    cv2.fillConvexPoly(mask, dstMat.astype("int32"), (255, 255, 255),
                       cv2.LINE_AA)

    # this step is optional, but to give the source image a black border
    # surrounding it when applied to the input image, you can apply a
    # dilation operation
    rect = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    mask = cv2.dilate(mask, rect, iterations=2)

    # create a three channel version of the mask by stacking it depth-wise,
    # such that we can copy the warped source image into the input image
    maskScaled = mask.copy() / 255.0
    maskScaled = np.dstack([maskScaled] * 3)

    # copy the warped source image into the input image by (1) multiplying
    # the warped image and mask together, (2) multiplying the original
    # input image with the inverted mask (giving more weight to the input
    # where there *ARE NOT* masked pixels), and (3) adding the resulting
    # multiplications together
    warpedMultiplied = cv2.multiply(warped.astype("float"), maskScaled)
    imageMultiplied = cv2.multiply(image.astype("float"), 1.0 - maskScaled)
    output = cv2.add(warpedMultiplied, imageMultiplied)
    output = output.astype("uint8")

    # show the input image, the source image, and the augmented reality output
    cv2.imshow("Input", image)
    cv2.imshow("Source", source)
    cv2.imshow("OpenCV AR Output", output)
    print("press any key to continue or q to quit")
    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        sys.exit(0)
Replacing the Stock Image with a Custom OpenCV Canvas
You can create your own objects and display them instead of a static image, if you like. OpenCV represents these graphic panes, including images loaded from files, as arrays that act like a drawing canvas. So, you could remove the source = cv2.imread(args["source"]) line (and the argparse code above it, which parses your command-line arguments to obtain the file path of this source image), and replace it with something like source = create_canvas(), where create_canvas() is a function you write yourself. Here's a demo function:
def create_canvas(width=640, height=480):
    # Create an empty canvas
    canvas = np.zeros((height, width, 3), dtype=np.uint8)

    # Draw text on the canvas
    text = "Canvas for AR"
    text_color = (255, 255, 255)  # White color
    font = cv2.FONT_HERSHEY_SIMPLEX
    font_scale = 1
    thickness = 2
    text_size = cv2.getTextSize(text, font, font_scale, thickness)[0]
    text_x = int((width - text_size[0]) / 2)
    text_y = int((height + text_size[1]) / 2)
    cv2.putText(canvas, text, (text_x, text_y), font, font_scale, text_color, thickness)

    # Draw a shape on the canvas (in this case, a rectangle)
    shape_color = (0, 255, 0)  # Green color
    shape_start_point = (50, 50)
    shape_end_point = (width - 50, height - 50)
    cv2.rectangle(canvas, shape_start_point, shape_end_point, shape_color, thickness)
    return canvas
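Because a canvas like this is regenerated from scratch each time the function is called, you can pass in live data and redraw it every frame to make the overlay dynamic. As a sketch of that idea (the function name, the dBm bounds, and the bar layout below are illustrative choices, not part of the starter code), here is a hypothetical helper that renders one signal-strength bar per network using plain NumPy slicing, so it runs even without OpenCV's drawing functions:

```python
import numpy as np

def create_bar_canvas(readings, width=640, height=480, floor=-90, ceil=-30):
    """Draw one horizontal bar per network, scaled by signal strength in dBm."""
    canvas = np.zeros((height, width, 3), dtype=np.uint8)
    bar_height = 20
    y = 20
    for name, dbm in readings.items():
        if dbm is None:
            dbm = floor  # treat an unknown reading as the weakest possible
        # Normalize the reading to [0, 1] between the floor and ceiling bounds
        frac = max(0.0, min(1.0, (dbm - floor) / (ceil - floor)))
        bar_width = int(frac * (width - 40))
        # Fill a green rectangle by slicing directly into the image array
        canvas[y:y + bar_height, 20:20 + bar_width] = (0, 255, 0)
        y += bar_height + 10
    return canvas
```

You could label each bar with cv2.putText, as in create_canvas above, and pass the result to the overlay loop in place of the stock source image.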
Example Code for Measuring WiFi Signal Strength
Below is starter code to return a dictionary of WiFi names (SSIDs) and their signal strengths:
import subprocess
import re
import platform

def get_wifi_info(os_name):
    wifi_info = {}

    # Function to parse Linux Wi-Fi information
    def parse_linux(output):
        networks = {}
        current_ssid = None
        for line in output.split('\n'):
            ssid_match = re.search(r'SSID: (.+)$', line)
            signal_match = re.search(r'signal: (-\d+) dBm', line)
            if ssid_match:
                current_ssid = ssid_match.group(1)
                networks[current_ssid] = None  # Initialize the SSID with no signal strength
            elif signal_match and current_ssid:
                networks[current_ssid] = int(signal_match.group(1))
        return networks

    # Function to parse Mac Wi-Fi information
    def parse_mac(output):
        networks = {}
        for line in output.split('\n')[1:]:  # Skip the header line
            parts = line.strip().split()  # Remove leading spaces and split by whitespace
            if len(parts) >= 5:  # Ensure there are enough parts to include SSID, BSSID, etc.
                # The SSID can contain spaces, so join everything except the
                # last five fields, which are assumed to be the other metrics
                ssid = ' '.join(parts[:-5])
                rssi = parts[-5]  # RSSI should be the fifth-from-last element, assuming a fixed format
                # Validate and convert RSSI to integer
                try:
                    rssi_int = int(rssi)  # Convert RSSI to integer
                    # Keep the strongest reading seen so far for this SSID
                    if ssid not in networks or networks[ssid] < rssi_int:
                        networks[ssid] = rssi_int
                except ValueError:
                    # This can happen if RSSI is not a number, so we skip this line
                    continue
        return networks

    # Function to parse Windows Wi-Fi information
    def parse_windows(output):
        networks = {}
        # The Windows netsh command has a different output format
        for block in output.split('\n\n'):
            ssid_match = re.search(r'SSID\s+\d+\s+:\s(.+)', block)
            signal_match = re.search(r'Signal\s+:\s(\d+)%', block)
            if ssid_match and signal_match:
                rssi = int(signal_match.group(1))  # netsh reports signal 'strength' as a percentage
                rssi = (rssi / 2) - 100  # convert from a percentage to an approximate dBm value from -100 to -50
                networks[ssid_match.group(1)] = rssi
        return networks

    if os_name.lower() == 'linux':
        result = subprocess.run(['nmcli', '-t', 'device', 'wifi', 'list'], capture_output=True, text=True)
        wifi_info = parse_linux(result.stdout)
    elif os_name.lower() in ('mac', 'darwin', 'macos'):
        result = subprocess.run(['/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport', '-s'], capture_output=True, text=True)
        wifi_info = parse_mac(result.stdout)
    elif os_name.lower() == 'windows':
        result = subprocess.run(['netsh', 'wlan', 'show', 'networks', 'mode=Bssid'], capture_output=True, text=True, shell=True)
        wifi_info = parse_windows(result.stdout)
    else:
        raise ValueError("Unsupported operating system")
    return wifi_info

# Example usage:
os_name = platform.system()  # Automatically detect the OS
wifi_info = get_wifi_info(os_name)
print(wifi_info)
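For the hotspot application, you may want to turn each reading from get_wifi_info into a "hot/cold" color for your overlay. One possible mapping (the function name and the -90/-30 dBm bounds are illustrative choices, not part of the starter code) linearly blends from blue for weak signals to red for strong ones, in OpenCV's BGR channel order:

```python
def signal_to_color(dbm, floor=-90, ceil=-30):
    """Map an RSSI reading in dBm to a BGR color from blue (weak) to red (strong)."""
    if dbm is None:
        return (128, 128, 128)  # gray for networks with no recorded strength
    # Clamp the reading to [floor, ceil] and normalize to [0, 1]
    frac = max(0.0, min(1.0, (dbm - floor) / (ceil - floor)))
    # Blend channels: blue dominates when frac is near 0, red when near 1
    return (int(255 * (1 - frac)), 0, int(255 * frac))
```

The returned tuple can be passed directly as the color argument to cv2.putText or cv2.rectangle when you draw each network on your canvas.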
Example Code for Determining the GPS Coordinates Corresponding to Your Location
Below is a function that returns your approximate latitude and longitude if geocoding services are available on your device:
# pip install geocoder
import geocoder

def get_current_location():
    # Attempt to get the user's location using their IP address
    location = geocoder.ip('me')
    if location.ok:
        # Return a dictionary containing the latitude and longitude
        return {
            'latitude': location.lat,
            'longitude': location.lng
        }
    else:
        # Return a message indicating that the location could not be determined
        return "Location information is not available."

# Example usage:
print(get_current_location())
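For the scavenger hunt option, the coordinates returned by get_current_location can drive hot/cold feedback by comparing them against a target location. A standard way to compute the separation between two latitude/longitude points is the haversine great-circle distance; the helper below is a sketch (the function name is mine, and IP-based geolocation is only approximate, so treat small distances with skepticism):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in meters."""
    r = 6371000  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))
```

Your game loop could call this each frame with the player's current position and the hidden flag's coordinates, and color the overlay warmer as the distance shrinks.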
Submission
In your submission, please include answers to any questions asked on the assignment page in your README file. If you wrote code as part of this assignment, please describe your design, approach, and implementation in your README file as well. Finally, include answers to the following questions:

- Describe what you did, how you did it, what challenges you encountered, and how you solved them.
- Please answer any questions found throughout the narrative of this assignment.
- If collaboration with a buddy was permitted, did you work with a buddy on this assignment? If so, who? If not, do you certify that this submission represents your own original work?
- Please identify any and all portions of your submission that were not originally written by you (for example, code originally written by your buddy, or anything taken or adapted from a non-classroom resource). It is always OK to use your textbook and instructor notes; however, you are certifying that any portions not designated as coming from an outside person or source are your own original work.
- Approximately how many hours did it take you to finish this assignment (I will not judge you for this at all... I am simply using it to gauge whether the assignments are too easy or too hard)?
- Your overall impression of the assignment. Did you love it, hate it, or were you neutral? One word answers are fine, but if you have any suggestions for the future let me know.
- Using the grading specifications on this page, discuss briefly the grade you would give yourself and why. Discuss each item in the grading specification.
- Any other concerns that you have. For instance, if you have a bug that you were unable to solve but you made progress, write that here. The more you articulate the problem the more partial credit you will receive (it is fine to leave this blank).
Assignment Rubric
Description | Pre-Emerging (< 50%) | Beginning (50%) | Progressing (85%) | Proficient (100%) |
---|---|---|---|---|
Human-Centric Design (20%) | A trivial application of the modality is provided without regard to proper signifiers or affordances to facilitate human interaction | Some consideration is given to the manner by which augmented reality is incorporated into the program, but it is not clear at all times to the user what to do and how to interact | The user is able to interact with the program using augmented reality in most cases, with a few minor ambiguities that could be identified through additional testing | The user experience is enhanced by the use of augmented reality |
Design Report (20%) | No design report is included | A design report is included that describes the approach taken to solving the problem and incorporating augmented reality in a trivial way | A design report is included that describes the approach taken to solving the problem and incorporating augmented reality in a manner that carefully considers the problem from the perspective of one stakeholder | A design report is included that describes the approach taken to solving the problem and incorporating augmented reality through documented discussions and test cases with a variety of stakeholders |
Algorithm Implementation (30%) | The algorithm fails on the test inputs due to major issues, or the program fails to compile and/or run | The algorithm fails on the test inputs due to one or more minor issues | The algorithm is implemented to solve the problem correctly according to given test inputs, but would fail if executed in a general case due to a minor issue or omission in the algorithm design or implementation | A reasonable algorithm is implemented to solve the problem which correctly solves the problem according to the given test inputs, and would be reasonably expected to solve the problem in the general case |
Code Quality and Documentation (20%) | Code commenting and structure are absent, or code structure departs significantly from best practice, and/or the code departs significantly from the style guide | Code commenting and structure is limited in ways that reduce the readability of the program, and/or there are minor departures from the style guide | Code documentation is present that re-states the explicit code definitions, and/or code is written that mostly adheres to the style guide | Code is documented at non-trivial points in a manner that enhances the readability of the program, and code is written according to the style guide |
Writeup and Submission (10%) | An incomplete submission is provided | The program is submitted, but not according to the directions in one or more ways (for example, because it is lacking a readme writeup or missing answers to written questions) | The program is submitted according to the directions with a minor omission or correction needed, including a readme writeup describing the solution and answering nearly all questions posed in the instructions | The program is submitted according to the directions, including a readme writeup describing the solution and answering all questions posed in the instructions |
Please refer to the Style Guide for code quality examples and guidelines.